Abstract


Finite state models of concurrent systems grow exponentially as the
number of components of the system increases. This is known widely as
the state explosion problem in automatic verification, and has limited
finite state verification methods to small systems. To avoid this
problem, a method called symbolic model checking is proposed and studied.
This method avoids building a state graph by using Boolean formulas
to represent sets and relations. A variety of properties characterized by
least and greatest fixed points can be verified purely by manipulations
of these formulas using Ordered Binary Decision Diagrams.

Theoretically, a structural class of sequential circuits is demonstrated
whose transition relations can be represented by polynomial space
OBDDs, though the number of states is exponential. This result is
borne out by experimental results on example circuits and systems.
The most complex of these is the cache consistency protocol of a
commercial distributed multiprocessor. The symbolic model checking
technique revealed subtle errors in this protocol, resulting from
complex execution sequences that would occur with very low probability in
random simulation runs.

In order to model the cache protocol, a language was developed for
describing sequential circuits and protocols at various levels of
abstraction. This language has a synchronous dataflow semantics, but allows
nondeterminism and supports interleaving processes with shared
variables. A system called SMV can automatically verify programs in this
language with respect to temporal logic formulas, using the symbolic
model checking technique.

A technique for proving properties of inductively generated classes
of finite state systems is also developed. The proof is checked
automatically, but requires a user supplied process called a process invariant
to act as an inductive hypothesis. An invariant is developed for the
distributed cache protocol, allowing properties of systems with an
arbitrary number of processors to be proved.

Finally, an alternative method is developed for avoiding the state
explosion in the case of asynchronous control circuits. This technique
is based on the unfolding of Petri nets, and is used to check for hazards in
a distributed mutual exclusion circuit.

Chapter 1
Introduction 


There are several practical reasons for applying formal verification
methods to computer systems. The most obvious is the high cost of
correcting errors in digital designs. This cost has been increasing with the
rising level of integration in digital circuit technology. It can be decreased
to an extent in application specific designs by the use of programmable
device technologies, but at least for the present, programmable logic
has distinct disadvantages in performance and area. Thus, there is a
growing demand for design methodologies that can yield correct
designs on the first fabrication run. Design errors that are discovered
before fabrication can also be quite costly, however, in terms of the
engineering effort required to correct the error, and the resulting impact
on development schedules. At present, the best tools available to
engineers for finding errors before fabrication are simulators, which model
the behavior of a system for predetermined or random input patterns.
The engineer using simulation is faced with two ill-characterized and
increasingly intractable problems. The first is creating a set of input
patterns that is sufficient to expose any incorrect behavior of the
system, and the second is determining the correct output of the system
under these conditions, to be compared with the simulated output.
Increased density of integration has allowed higher level functions such as
network protocols to be implemented in hardware, and as a result, the
problems of simulation have become critical. What seems to be needed
is a precise yet understandable way of specifying correct behavior, and
an exhaustive method of determining that the system model satisfies
this specification for all input patterns. This is the meaning of formal
verification.

A formal verification framework has three basic elements: a
mathematical model of the system to be verified, a formal language for
framing the correctness problem, and a methodology for proving the
statement of correctness. One characteristic that many automatic
verification methodologies have in common is that they require an exhaustive
search of the state space of the model. Owing to simple combinatorics,
the size of this state space can be, and usually is, exponential in the
size of the system being modeled. This exponential growth in the state
space, known as the state explosion problem, is the limiting factor in
applying automatic verification methodologies to large systems.

This thesis is directed toward solutions for the state explosion
problem. This is essentially a question of methodology, but before we can
discuss methodology, we need to discuss some of the models and
formalisms that are commonly used in formal verification of hardware.


1.1  Background 

The problem of hardware verification is in some ways similar to, and
in other ways different from, the problem of proving correctness of
programs. Digital systems are most similar to what Pnueli has
characterized as reactive programs [Pnu86], in that they receive input and
produce output in a continuous interaction with their environment,
rather than computing a single result and halting. In addition, the
behavior of digital systems is concurrent in the extreme, since every gate
in the system is simultaneously evaluating its output as a function of
its inputs.


1.1.1  Temporal  logic 

For reasoning about concurrent, reactive programs, Pnueli proposed
the use of a formal system originally studied by philosophers, called
temporal logic [Pnu77, Pnu86, MP81, Kro87]. In a temporal logic, the
usual operators of propositional logic are augmented by tense operators,
which are used to form assertions about changes in time. One can
assert, for example, that if proposition p holds in the present, then
proposition q holds at some instant in the future, or at some instant in
the past. The temporal modalities can be combined to express fairly
complex statements about past, present and future. For example, “if p
holds in the present, then at some instant in the future, p will have held
in the past.” A temporal system provides a complete set of axioms and
inference rules for proving all validities in the logic for a given model of
time, such as partially ordered time, linearly ordered time, dense time,
and even branching time.

Temporal logic is powerful enough to define a semantics for
programs which captures not only the traditional before and after
conditions of Floyd-Hoare style program proving, but also a wide variety
of temporal properties of programs, such as termination, possible
termination, termination under fair scheduling of concurrent processes,
etc. [CE81a, BAMP81]. In the hardware area, Malachi and Owicki
used temporal logic to give a concise specification of the conditions
necessary for an asynchronous circuit to be speed independent [MO81].
Bochmann used temporal logic to give a semantics for self timed
circuits, and used this system to verify a corrected version of an arbiter
circuit [Sei80a]. Formal proofs of this kind are extremely tedious and
difficult, however, and computationally intractable to automate. To
simplify the hand proof, Bochmann used a somewhat oversimplified
semantics for the circuit elements (neglecting gate delay) and as a result,
missed a bug in the design, which was demonstrated by Dill [DC86].

A more practical application of temporal logic in hardware
verification, called model checking, was introduced by Clarke and
Emerson [CE81b] and independently by Queille and Sifakis [QS81]. Instead
of proving the validity of a logical formula for all models, a model
checker determines the truth value of the formula in a specific finite
model. For branching time logic, the model checking problem is
computationally tractable, even though the validity problem is intractable.
Here an important distinction between hardware and software systems
comes into play: hardware systems are finite-state. This allows the
proof procedure to be automated using model checking, while
maintaining the formal elegance of temporal logic for specifying correct behavior.

The method of Clarke and Emerson first builds a complete state
graph of the system from a description in an appropriate language.
The truth value of a formula in the logic is determined by an algorithm
which propagates formulas in this state graph until a fixed point is
reached. Besides being fast and fully automatic, this technique has the
advantage that it can produce state sequences as counterexamples when
the formula being checked is false. This has made it possible to find
bugs in a number of small but fairly subtle circuit designs [BCDM86,
BCD86], including the one verified by Bochmann.
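
The propagation step can be illustrated on an explicit state graph. The
following is a minimal sketch, not the algorithm of [CE81b] itself: it labels
the states satisfying E[p U q] by working backward from the q-states until no
new state can be added. The graph encoding, the function name `check_eu`, and
the four-state example are assumptions made purely for illustration.

```python
from collections import deque

def check_eu(succ, p_states, q_states):
    """Return the set of states satisfying E[p U q] in an explicit graph.

    succ maps each state to the list of its successors (hypothetical encoding).
    """
    # Build predecessor lists so that labels can be propagated backward.
    pred = {s: [] for s in succ}
    for s, targets in succ.items():
        for t in targets:
            pred[t].append(s)

    labeled = set(q_states)      # q-states satisfy E[p U q] immediately
    work = deque(labeled)
    while work:                  # the loop ends when the fixed point is reached
        t = work.popleft()
        for s in pred[t]:
            if s in p_states and s not in labeled:
                labeled.add(s)
                work.append(s)
    return labeled

# Illustrative graph: 0 -> 1 -> 2 -> 2, and an isolated self-loop 3 -> 3.
succ = {0: [1], 1: [2], 2: [2], 3: [3]}
result = check_eu(succ, p_states={0, 1, 3}, q_states={2})
```

Each state and each edge is visited at most once, but the whole state graph
must be stored, which is the cost the state explosion problem makes
prohibitive.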

For linear time logic, there is a decision procedure that translates a
formula into an automaton by means of a tableau construction [RU71,
CE81b, BAMP81]. This construction is similar to the semantic
tableaux method of constructing proofs in standard logic [Smu68]. Each
state in the tableau is associated with a set of formulas which are true
in that state. Since the number of states in the tableau is exponential
in the size of the formula, the method is not practical for proofs about
very large systems. However, the tableau method can be used in a
model checking framework, yielding an algorithm which is exponential
in the size of the formula but linear in the size of the model [LP85].

1.1.2  Automata  theoretic  models 

An alternative to the temporal logic framework is to cast the
correctness problem in terms of a relation between the external or observable
behaviors of two processes. One way to define this relation is by
considering the set of all possible sequences of communications between
processes. For example, in the ω-automata model of Kurshan [Kur86],
these sequences are defined by the language of an ω-automaton.
Correctness is framed as the containment of the language of one automaton
in the language of another. This asymmetric relation makes it possible
to "underspecify" a system, that is, to leave some choices open to the
designer. The use of automata on infinite strings makes it possible to
express liveness properties. For instance, one can easily construct an
automaton whose language is the set of all infinite strings such that
every time a message is sent on some channel, one is eventually
received. Language containment between ω-automata can be established
by an algorithm which searches for cycles in the state space of a product
automaton.

Van de Snepscheut [vdS83] and Dill [Dil88] have used trace
theory to model speed independent circuits [Sei80b]. A trace is simply a
history of the communications between a process and its environment.
The trace sets of two processes can be combined in a way which
models communication between the two processes by synchronizing signals
sent and received on the same channel. Dill’s system is a circuit
algebra which has both a structural interpretation (describing the physical
connection of wires) and a trace theoretic interpretation (describing
the communications along those wires). The actual trace sets are
defined by the languages of finite automata (in this case, automata on
finite strings, hence liveness cannot be modeled). A relationship called
conformance between two processes determines when one process can
safely be substituted for the other in all environments. Conformance
can be tested by a polynomial algorithm which searches the state space
of a finite automaton derived from the two processes.

In the Calculus of Communicating Systems (CCS) [Mil80], Milner
takes a different approach in which external behavior is modeled by a
tree rather than a set of sequences. The way CCS models
communication is not well suited to modeling hardware, since in CCS a signal
cannot be sent until a receiver is ready to receive it. In hardware, a
receiver cannot generally prevent a signal from being sent. Also, in CCS,
communication is always between two processes, while in hardware
signals are often broadcast to many receivers. A calculus specialized to
circuits called CIRCAL [Mil83] was developed to remedy these
problems. The notion of correctness in process calculi is called observational
equivalence, meaning that an observer cannot distinguish between two
processes by any experiment. This notion of correctness is extremely
strict, since it doesn't allow the specifier to leave any choice up to
the designer regarding the externally visible behaviors. Observational
equivalence can be proved by establishing a relation called
bisimulation between the two processes. For finite state processes, there is a
polynomial time algorithm for bisimulation which is very similar to
the coarsest partitioning algorithms used for state machine
minimization [NH84].
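
The partitioning idea can be sketched as follows. This is a naive refinement
loop for strong bisimulation on a labeled transition system; it ignores
silent actions and is not the more efficient algorithm cited above. The
encoding of transitions as triples, the function name, and the example
processes are assumptions made for illustration.

```python
def bisimulation_classes(states, trans):
    """Naive partition refinement for strong bisimulation.

    trans is a set of (source, action, target) triples (hypothetical encoding).
    Returns a dict mapping each state to the index of its equivalence class.
    """
    block = {s: 0 for s in states}            # start with a single block
    while True:
        # Signature of a state: its current block, plus the set of
        # (action, target block) pairs it can reach in one step.
        sig = {
            s: (block[s],
                frozenset((a, block[t]) for (u, a, t) in trans if u == s))
            for s in states
        }
        ids = {}
        new_block = {s: ids.setdefault(sig[s], len(ids)) for s in states}
        if len(ids) == len(set(block.values())):
            return new_block                   # no block was split: stable
        block = new_block

# Illustrative processes: p0 and q0 both alternate a and b forever,
# while r0 can do a single a and then stops.
states = ["p0", "p1", "q0", "q1", "q2", "r0", "r1"]
trans = {
    ("p0", "a", "p1"), ("p1", "b", "p0"),
    ("q0", "a", "q1"), ("q1", "b", "q2"), ("q2", "a", "q1"),
    ("r0", "a", "r1"),
}
cls = bisimulation_classes(states, trans)
```

Two states are bisimilar exactly when they end up in the same class, so here
p0 and q0 are equivalent while r0 is distinguished from both.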

All of these methods can be viewed as variations on the theory of
finite automata, tailored for modeling certain properties of a
particular class of systems. In fact, the automata theoretic approach is not
very far from the temporal logic approach in practice. The difference
is mostly a question of notations, since the tableau method provides a
way of translating a temporal logic formula into an automaton.
Although temporal logic is not as expressive as automata in
characterizing classes of sequences, it has been shown by Wolper that temporal
logic can be extended using right linear grammars to make it as
expressive as automata without increasing the complexity of the decision
procedure [Wol83]. Clarke and Kurshan have also proposed a
branching time logic in which the temporal operators are defined by finite
ω-automata [CGK89].

What all of the above systems have in common is that correctness,
once formalized, can be determined by an algorithm that searches the
entire state space of a finite state model. Such methods have the
advantage of being fully automatic, but invariably suffer from the state
explosion problem.


1.2  Scope  of  the  thesis

This thesis explores methods of state space search that avoid the state
explosion problem by not explicitly representing the states of the model.
To do this, some revolutionary new techniques are borrowed from the
area of switching function analysis. In this domain, a combinational
explosion also arises, since the number of input combinations to a Boolean
function is exponential in the number of inputs. New techniques for
Boolean comparison avoid this problem by representing Boolean
functions with a reduced form of decision graph called an Ordered Binary
Decision Diagram (OBDD) [Bry86]. These decision graphs provide a
compact canonical form for Boolean functions. To apply this idea to
temporal verification, we observe that if the state of a system is
represented by a vector of Boolean variables, then a set of states can be
represented by a Boolean function which returns true for exactly the
states in the set. Similarly, a relation xRy between states can be represented
by a Boolean function of two sets of variables, one set representing x
and the other representing y. In this way, a model checking algorithm
can be developed which uses OBDDs to represent sets and relations.
Borrowing terminology from Bryant, this technique is called symbolic
model checking, since symbolic variables are used to represent the
components of the system state rather than numeric values. Using symbolic
model checking, we can automatically verify some regularly
structured systems with literally astronomical numbers of states.
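
To make the encoding concrete, here is a small sketch in which Python
predicates stand in for the Boolean functions. No OBDDs are involved: the
existential quantification in `image` is done by brute-force enumeration,
which is exactly the cost an OBDD representation is meant to avoid. The
three-bit counter, the parity set, and all function names are assumptions
chosen for illustration.

```python
from itertools import product

N = 3  # number of Boolean state variables (illustrative)

def even_parity(x):
    """Characteristic function of a set: true for states with an even number of 1s."""
    return sum(x) % 2 == 0

def increment(x, y):
    """Characteristic function of a relation x R y: y is x + 1 modulo 2**N."""
    as_int = lambda v: sum(bit << i for i, bit in enumerate(v))
    return as_int(y) == (as_int(x) + 1) % (2 ** N)

def image(S, R):
    """Characteristic function of {y | there exists x with S(x) and x R y}."""
    return lambda y: any(S(x) and R(x, y) for x in product((0, 1), repeat=N))

# Enumerate the resulting set only to inspect it; bit 0 is the least significant.
successors_of_even = image(even_parity, increment)
members = sorted(y for y in product((0, 1), repeat=N) if successors_of_even(y))
```

The point of the sketch is that `image` takes and returns functions, never a
list of states; a symbolic implementation performs the same operation on
formulas.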

The principal contributions of this work are detailed below:

The symbolic model checking method. A technique is developed for
state space search using Ordered Binary Decision Diagrams. We show
that any algorithm that can be expressed in a fixed point logic called
the Mu-Calculus can be computed using this method. These include
algorithms for all of the correctness notions enumerated above, including
CTL model checking (with fairness constraints), the linear time tableau
method, conformance, observational equivalence, language containment
for ω-automata, Mealy machine equivalence, and others. From a
theoretical point of view, a structural class of sequential circuits is
identified whose transition relations can be represented by polynomially
bounded OBDDs. This theoretical result is borne out by experiments on
classes of regularly structured circuits, for which the time used by the
symbolic model checking method is found to be polynomially bounded
in the circuit size. In addition, some experiments are reported using
symbolic model checking to compute the equivalence relation between
states of a finite state machine. Several techniques are advanced which
improve the efficiency of this computation in practice.
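
As a rough illustration of what "expressed as a fixed point" means, the
sketch below computes two CTL operators by iterating a monotonic function on
sets of states until it stabilizes. Explicit Python sets stand in for the
OBDDs; in a symbolic implementation each set would be a Boolean formula. The
helper names `lfp`, `gfp`, and `ex`, and the four-state model, are
assumptions for illustration only.

```python
def lfp(f):
    """Least fixed point of a monotonic set function, iterating up from the empty set."""
    z = frozenset()
    while f(z) != z:
        z = f(z)
    return z

def gfp(f, universe):
    """Greatest fixed point, iterating down from the full state set."""
    z = frozenset(universe)
    while f(z) != z:
        z = f(z)
    return z

# Illustrative model: states 0..3, transitions as (source, target) pairs.
STATES = frozenset(range(4))
R = {(0, 1), (1, 0), (1, 2), (2, 3), (3, 3)}

def ex(z):
    """EX z: the states with at least one successor in z."""
    return frozenset(s for (s, t) in R if t in z)

p = frozenset({0, 1, 2})
q = frozenset({3})

ef_q = lfp(lambda z: q | ex(z))            # EF q = mu Z. q or EX Z
eg_p = gfp(lambda z: p & ex(z), STATES)    # EG p = nu Z. p and EX Z
```

In this model every state can reach state 3, so EF q holds everywhere, while
EG p holds only in states 0 and 1, which can cycle between each other without
leaving p.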

The SMV system. A symbolic model checking system called SMV is
presented. This system permits the automatic verification of programs
written in a specialized language for describing concurrent finite state
systems and protocols. This language is somewhat similar to
LUSTRE [CHPP87] in its synchronous dataflow semantics, but has several
unique aspects. For example, it allows systems to be modeled
nondeterministically for purposes of abstraction, it allows arbitrary
interleaving of concurrent processes, and it allows programs to be annotated
with assertions in branching time temporal logic.

Formal verification of the Encore Gigamax cache consistency
protocol. The cache consistency protocol of a distributed shared-memory
multiprocessor called the Encore Gigamax is modeled in the SMV
language and verified using the symbolic model checker. Running in
minutes, the symbolic model checker discovered errors in this system which
were not discovered by simulation, in spite of the very large state space
of the model [MS91]. This experiment shows that the model checking
technique can be used effectively in an industrial setting for highly
complex systems. It also sheds light on issues involved in modeling such
protocols as finite state systems, and the kinds of errors that can be
found by model checking that are not likely to be found by simulation.

Induction over processes. A partially automated method of
induction is described for proving properties of parameterized classes of
designs. The method applies to a variety of process models, requiring of
the model only certain simple algebraic properties. The SMV system
is extended to support proof by induction, allowing some properties of
the Gigamax cache protocol to be verified for configurations of arbitrary
size.

Verification using occurrence nets. An alternative method for
avoiding the state explosion is examined. This technique avoids considering
all of the possible interleavings of concurrent actions by using a partially
ordered representation of behavior called an occurrence net [NPW81].
This method is used to verify that a design for an asynchronous
distributed mutual exclusion circuit is hazard free (this example is also
used for the symbolic model checking method). Using this technique,
we also find empirically that the run time is polynomial in the number
of components of the system, while the number of states is exponential.


1.3  Related  research 

Since the state explosion problem is ubiquitous in the verification of
computer systems and protocols, many researchers in the area have
studied it.


1.3.1  Reduction 

The most common approach is based on reduction: reducing the
correctness problem to a similar problem in a smaller state space. This is
generally done by replacing processes in the model by smaller processes
that have similar or identical communication behavior. The most
general framework for this kind of reduction is that of Kurshan [Kur87].
Using homomorphic reductions of ω-automaton models, it is possible
to simplify not only the internal state of a process, but also its external
communications. In this methodology, one generally builds a hierarchy
of reductions, in which processes at the lowest level are reduced, then
combined at the next higher level and further reduced, etc. Kurshan
advocates building this hierarchy from the top down, so that the most
abstract models can be verified before details are filled in at the next
lower level.

A hierarchical approach was also taken by Dill in his trace theoretic
system for speed independent circuits [Dil88]. In this case, the
reduction is obtained mostly by hiding internal signals of a module. There is
no provision for abstracting the signals by which the module
communicates with its environment. That is, communication always remains
at the same level, that of digital signal transitions.

The reduction approach is generally not automatic. Usually, the
reduced process is obtained in an ad hoc manner, and the validity of
the reduction is then tested automatically. Some methods have been
proposed for obtaining reduced processes automatically, however. For
example, in a method called compositional model checking, a state
minimization procedure is used to obtain a reduced process that is
equivalent to the original process with respect to observation via its inputs
and outputs [CLM89b, CLM89a]. This reduction preserves the truth
value of all formulas in a suitable logic. Graf and Steffen have also
studied minimization with respect to a suitable notion of equivalence as a
reduction technique [GS91]. Minimization techniques are fairly strict
in terms of the required relation between the original and reduced
processes, however. As a result, the reduction that can be obtained using
these techniques is not generally as great as can be obtained using more
flexible but unautomated methods.

The symbolic model checking technique is not really an alternative
to reduction methods, but is complementary to them. In general, the
larger the state space that can be searched automatically, the less the
need for reduction. For example, Dill used a reduction (constructed by
hand) to verify a speed independent distributed mutual exclusion ring
circuit [Dil88]. Using symbolic model checking, there is no need for a
reduction: the verification time is polynomial in the size of the ring (cf.
chapter 2). On the other hand, symbolic model checking techniques can
be used to implement the validity test for reductions (cf. chapter 5),
hence the two techniques can be combined.

1.3.2  Induction

In systems of many identical processes, it is sometimes possible to
reduce an arbitrary number of processes to a single process while
retaining certain properties of interest. For example, Browne, Clarke and
Grumberg proposed a reduction technique of this sort which preserves
the truth value of formulas in a suitably restricted logic with process
quantifiers [BCG86]. Unfortunately, the reduction, a form of
bisimulation, had to be established by hand. There was no automated way of
checking it. Kurshan and McMillan proposed an inductive method of
establishing the reduction that could be checked automatically [KM89].
The method is also less restrictive in terms of the properties that can
be proved since it does not rely on bisimulation. This method is used
in chapter 5. A similar method was described independently by Wolper
and Lovinfosse [WL89]. Another inductive technique has been
described by Shtadler and Grumberg [SG89]. This technique is somewhat
more flexible in that it treats networks generated by context free
grammars, but is limited to bisimulation as a reduction technique. A more
detailed comparison of these methods can be found in chapter 5.

1.3.3  Other  symbolic  methods 

Coudert and Madre have described a method for verifying finite state
machines using Ordered Binary Decision Diagrams which is similar to
symbolic model checking [CBM89]. The symbolic model checking
technique was developed in 1987. The technique of Coudert and Madre
appears to have been developed two years later [Cou91] but
independently. There are several differences of approach between the two
methods. Symbolic model checking is directed mostly toward
proving temporal properties of finite state systems, whereas Coudert and
Madre have concentrated mostly on proving equivalence of
deterministic Mealy machines (though they also discuss temporal logic [CMB91]).
Testing Mealy machine equivalence is useful, for example, when one is
mapping a design from one technology to another, but is a fairly
limited form of verification, since the specification is at the same level
of detail as the implementation. Also, in this work, we consider the
performance of algorithms mostly in terms of asymptotic behavior for
regularly structured classes of systems, while Coudert and Madre have
considered mostly a set of benchmark circuits for synthesis. This makes
it difficult to determine how well their technique scales with circuit size.
Finally, Coudert and Madre have not as yet reported any results for
testing equivalence of two different implementations of the same finite
state machine. In practice, they have only used symbolic techniques to
generate the set of reachable states of a finite state machine. This
information is useful for test generation and sequential synthesis [TSL+90],
but these experiments provide no information about how well the
technique works for verification. On the other hand, the symbolic model
checking technique has been applied to the verification of an industrial
design for a distributed cache consistency protocol (cf. chapter 1). A
more detailed description of the work of Coudert and Madre, and
others using OBDDs for sequential circuit verification, can be found in
chapter 2.

Chapter 2

Symbolic  model  checking 


As mentioned in the introduction, a formal verification system has
several basic elements. First, we require a model. A model is an imaginary
universe, or more generally, a class of possible imaginary universes. To
make our model meaningful, we require a theory that predicts some
or all of the possible observations that might be made of the model.
An observation generally takes the form of the truth or falsehood of a
predicate, or statement about the model. Finally, to verify something
meaningful about the model, we require a methodology for proving
statements that are true in the theory.

In program proving, the universe is a totally imaginary one, driven
by mechanisms (the compiler and hardware) of which the programmer
has no knowledge. The logician is free to assign any semantics at all
to programs, provided the compiler writer and hardware designer agree
to implement them. This makes program proving an artificial science,
in the sense that our theory is true because we say it is. In contrast, a
hardware verification system requires a model of a real physical system.
The underlying physical mechanism is still invisible to us (we can only
postulate its existence), but we can empirically construct a model which
predicts the necessary observations with a sufficient degree of accuracy
for our purposes (the verification of digital circuits). It turns out that
the required degree of accuracy is not very large. Though quite accurate
models are possible, using partial differential equations to describe the
time evolution of fields and particle densities, a suitable design style
makes it possible to consider only the digital (one or zero) value of
voltages, ignoring entirely the exact voltage within the digital ranges,
and the time it takes to switch from one range to another. Depending on
the design style (e.g., synchronous or self timed), different models may
be appropriate. In certain rare cases, we may have to use differential
equations to model the analog behavior of circuits (for example, when
metastability arises). In this thesis, though, we will consider only fairly
abstract models of circuits as finite state machines. Thus, we return to
the science of the artificial, wherein we choose the theory to suit our
needs, but with the understanding that a method exists for translating
our models into real systems.

The kind of theory that emerges for the model depends to a large
extent on the kind of experiments the observer is able to perform. For
example, in traditional program proving systems, the observer is allowed
to set up the initial state of the program, wait for the program to
terminate, and then examine the final state. The theory of this model can
be expressed in a kind of before-and-after logic whose axioms determine
the semantics of programs. For example, in Floyd-Hoare logic [Hoa69],
the formula

{true} x := y {x = y}

is an axiom: for any initial condition, after the program x := y
terminates, x and y have the same value. The fact that no other variables
change value in the process can also be expressed as an axiom:

{z = a} x := y {z = a}

provided neither z nor a depends on x.

In this system, if the program fails to terminate (diverges), the
observer must simply wait forever, i.e., no observation is possible. One
might ask whether waiting forever is not itself an observation, that
is, should it not be possible to state in the semantics that a given
program terminates or doesn't terminate for a given initial condition?
This point can be argued either way for programs (since knowing that
a program terminates before infinity is not very practical information).
However, for digital systems (or reactive systems in general), it is clear
that simple before and after conditions are not a sufficient theory: first
of all, termination for these systems is not well defined, and moreover
the meaning of what these systems are supposed to do is inseparably
linked with the evolution of events in time [Pnu77].¹ What we need is
a formal theory in which we can reason about temporal aspects of a
system's behavior.


2.1  Temporal  logic 

Temporal logic (or tense logic) is a system devised by philosophers
expressly for making statements about changes in time [Bur84]. In
temporal logic, the formula Fq is true in the present if q is true at some
moment in the future. Similarly, Pq is true in the present if q is true
at some moment in the past. These tense operators, F and P, have
duals which are generally given their own names. The formula Gq is
equivalent to ¬F¬q, meaning that q is true at every moment in the
future. The formula Hq is equivalent to ¬P¬q, meaning that q is true
at every moment in the past. These operators can give surprisingly
concise expressions of sentences with complex tense structures. For
example, q ⇒ FPq can be interpreted as "if q holds in the present, then
at some time in the future q will have held in the past".

The usual model theoretic semantics given to temporal logic (and
other modal logics) is the so-called possible worlds semantics. A frame in
this semantics consists of a class S of states through which the system
evolves, and a relation < representing temporal order. A model is a
frame with a valuation L, which assigns truth or falsehood to every
atomic proposition (propositional letter) in every state.² The truth or
falsehood of temporal formulas is relative to the present state. For
example, the formula Fq is true in state s iff there exists a state t such
that q is true in state t and s < t. Similarly, Pq is true in state s iff
there exists a state t such that q is true in state t and t < s. Notice that
a temporal formula acts like an open sentence, with one free parameter
representing the present state. Thus it defines a class of states in which
the formula is true. Similarly, a state defines a class of formulas which
are true in that state.
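
As a small illustration of this semantics, the sketch below evaluates F, P
and G on an explicitly enumerated finite frame. It is not part of the
original development: the four-state frame, the set `q_states`, and the
helper names `F`, `P` and `G` are all assumptions chosen for the example,
and each formula is identified with the set of states in which it holds.

```python
# Hypothetical finite frame: states 0..3 with the strict order s < t.
states = [0, 1, 2, 3]
order = {(s, t) for s in states for t in states if s < t}

# Valuation for one atomic proposition q: the states in which q is true.
q_states = {2}

def F(formula_states):
    """States s for which some t with s < t satisfies the formula."""
    return {s for s in states
            if any((s, t) in order and t in formula_states for t in states)}

def P(formula_states):
    """States s for which some t with t < s satisfies the formula."""
    return {s for s in states
            if any((t, s) in order and t in formula_states for t in states)}

def G(formula_states):
    """Dual of F: G q is equivalent to not F not q."""
    return set(states) - F(set(states) - formula_states)

print(sorted(F(q_states)))     # q lies in the strict future of these states
print(sorted(P(q_states)))     # q lies in the strict past of these states
print(sorted(F(P(q_states))))  # states satisfying F P q
print(sorted(G(q_states)))     # states all of whose futures satisfy q
```

Because the frame is finite, its last state has no future at all, so G of
any formula holds there vacuously; that is an artifact of the toy frame,
not of the logic.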

¹ The question of termination is in any event not undecidable for hardware
systems, since they are not computation universal (only programs are).

² These are usually called Kripke frames and Kripke models, after one of the
first mathematicians to give a model theoretic interpretation of modal logic.

The choice of axioms in the logic can be used to characterize the
temporal ordering relation <. For example, the following axioms (in
addition to the propositional tautologies) exactly characterize those
frames whose < relation is a partial order (transitive and antisymmetric)
[Bur84]:

G(p ⇒ q) ⇒ (Gp ⇒ Gq)    (2.1)

H(p ⇒ q) ⇒ (Hp ⇒ Hq)    (2.2)

p ⇒ GPp    (2.3)

p ⇒ HFp    (2.4)

One inference rule (in addition to modus ponens) is required: by
temporal generalization, if α is provable, we infer Gα and Hα (that
is, a tautology must hold true at all times, or perhaps, the rules of
sound inference do not change with time). By specializing this system
slightly, we can obtain logics characterizing a variety of models of time,
including linear time, discrete time, and branching (non-deterministic)
time. All of these results can be found in [Bur84].

2.1.1  Linear  time 

We usually think of time as a linearly ordered set, measuring it either
with the real numbers or the natural numbers. A frame is linearly
ordered if, in addition to being partially ordered, it is total, i.e., for all
states s, t, either s < t, s = t, or t < s. The temporal frames in which
< is a linear order can be characterized by simply adding the following
two axioms to the basic set (they are time reversal duals):

FPq ⇒ (Pq ∨ q ∨ Fq)    (2.5)

PFq ⇒ (Pq ∨ q ∨ Fq)    (2.6)

Linear temporal logic is usually extended by the until operator and
the since operator. Informally, p U q states that q will hold at some
moment in the future, until which time p will hold at all moments.
Similarly, p S q states that q held at some moment in the past, since
which time p has held at all moments. More precisely, p U q is true in
state s if there is some state t such that s < t and q is true in state t,
and for all u such that s < u < t, p is true in state u.

2.1.2 Discrete time

It is common in engineering to model time as a discrete sequence
(measured by the integers). Discrete dynamics are commonly used, for
example, in signal processing and synchronous digital systems. A discrete
frame is one in which every state has an immediate successor and an
immediate predecessor. The linear discrete frames can be characterized
by adding the following two axioms to those for linear time logic:


p ∧ Hp ⇒ FHp    (2.7)

p ∧ Gp ⇒ PGp    (2.8)

It is useful in a discrete linear temporal logic to define a next time
temporal operator. The formula Xq is true in state s when there is an
immediate successor of s in which q is true. A state t is an immediate
successor of s if s < t and there does not exist a state u such that
s < u < t. Thus, Xq is exactly equivalent to false U q, so its addition
does not increase the expressiveness of the logic.
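
The equivalence between Xq and false U q can be checked mechanically on a
small example. The sketch below is only an illustration under stated
assumptions: the six-state frame `states`, the proposition `q`, and the
helper names `until` and `next_time` are hypothetical, and a finite range
of integers stands in for a discrete linear frame (so the last state has
no successor).

```python
# Hypothetical finite stand-in for a discrete linear frame, ordered by <.
states = range(6)

def until(p, q, s):
    """p U q in state s: some t > s satisfies q, and p holds at every
    state strictly between s and t."""
    return any(q(t) and all(p(u) for u in states if s < u < t)
               for t in states if s < t)

def next_time(q, s):
    """X q in state s: some immediate successor of s satisfies q."""
    def immediate(t):
        return s < t and not any(s < u < t for u in states)
    return any(immediate(t) and q(t) for t in states)

q = lambda s: s in {2, 5}      # hypothetical atomic proposition
false = lambda s: False

x_q = [next_time(q, s) for s in states]
false_until_q = [until(false, q, s) for s in states]
print(x_q == false_until_q)
```

With `false` as the left argument, no state may lie strictly between s and
the witness t, which forces t to be the immediate successor of s.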

2.1.3  Branching  time 

A branching frame is one in which the temporal order < defines a
tree which branches toward the future. Thus, every instant has a
unique past, but an indeterminate future. This is an inherently
non-deterministic model of time, and hence is well suited, for example, for
defining a semantics of non-deterministic programs. A frame is tree
ordered when for all states s, t, u, if t < s and u < s then t < u, t = u
or t > u. In other words, the past of every state is linearly ordered.
The tree ordered frames can be characterized by simply dropping (2.6)
from the axioms of linear time logic.

Though pure tense logic can exactly characterize the branching time
frames, it leaves something to be desired in expressing properties of
non-deterministic programs. For example, it is common in defining the
semantics of these programs to say that a program aborts iff it must
inevitably abort. This functionality can be implemented by backtracking.
Similarly, a non-deterministic Turing machine terminates if it may
possibly terminate. These notions of inevitability and possibility are
not represented in an ordinary tense logic. They can be incorporated,
however, by combining notions from temporal logic and modal logic.

We would like to interpret the branching structure of time as meaning
that each instant of time has many possible futures, and that as time
evolves from present to future, these possibilities are reduced. Thus,
in the past, there existed possible futures which are now precluded.
This interpretation gives rise to notions of necessity (inevitability) and
possibility in tense logic [Tho84]. We think of the truth or falsehood of
tense formulas as being relative to a given branch of the tree ordered
frame (one possible evolution of time into the future). A branch is
defined as a maximal linearly ordered set of states. We will write q[s, b]
if q holds in state s in branch b. Thus, Fq[s, b] iff there exists a state
t in b such that s < t and q[t, b]. Similarly, Pq[s, b] iff there exists a
state t in b such that t < s and q[t, b]. The notion that q is necessarily
true is represented by the formula Aq. We will say Aq[s, b] iff for all
branches b' containing s, q[s, b']. The notion that q is possibly true is
represented by the formula Eq. We will say Eq[s, b] iff for some branch
b' containing s, q[s, b']. Notice that A and E provide a kind of second
order quantification over maximal linearly ordered subsets.³

According to this semantics for modal branching time logic, there
may be possibilities in the past that are foreclosed in the present. For
example, q ⇒ HAFq is not valid. The fact of q in the present does
not imply the necessity of q in the past. Thus, modal branching time
logic might be termed the logic of regret. The logic can also express
useful semantic properties of non-deterministic programs [BAMP81].
For example, if q represents the fact of a program terminating, then
inevitable termination is expressed by the formula AFq (necessarily in
the future q). Possible termination is expressed by EFq (possibly in the
future q). If the proposition p represents a correct output of the
program, then (inevitable) partial correctness is expressed by the formula
AG(q ⇒ p) (necessarily invariantly, termination implies correctness).
The somewhat odd but definable notion of possible partial correctness
is expressed by EG(q ⇒ p). Note that Pq, APq and EPq are all
logically equivalent, since the past of a state is the same for any branch.

Also note that A and E are dual, since Aq is equivalent to ¬E¬q.

³ Classically, the symbol □ is used to represent necessity, and ◇ is used to
represent possibility. The symbols A and E are used here for consistency
with [BAMP81].

2.2  The  temporal  logic  CTL 

The temporal logic CTL is a subset of modal branching time logic
defined by Clarke and Emerson [CE81b]. The acronym stands for
Computation Tree Logic.⁴ In CTL, temporal operators occur only in pairs
consisting of A or E, followed by F, G, U or X. Thus, past time
operators are not allowed, and tense operators cannot be combined directly
with the propositional connectives.

2.2.1  Syntax  and  semantics  of  CTL 

The  syntax  of  CTL  formulas  is  given  as  follows: 

1.  Every  atomic  proposition  is  a  CTL  formula. 

2. If f and g are CTL formulas, then so are

   ¬f, (f ∧ g), AXf, EXf, A(f U g), E(f U g)

The remaining operators are viewed as being derived from these
according to the following rules:

   f ∨ g = ¬(¬f ∧ ¬g)
   AFg   = A(true U g)
   EFg   = E(true U g)
   AGf   = ¬E(true U ¬f)
   EGf   = ¬A(true U ¬f)

The truth or falsehood of formulas is defined with respect to a
Kripke model, but in a slightly non-standard way. For CTL, the model
is a triple (S, R, L), where S is the set of states, R is the transition
relation and L is the valuation. The transition relation is the set of
all pairs (s, t) such that t is an immediate successor of s. A branching
model (a.k.a. computation tree) can be obtained by starting at a
designated state s and unwinding the graph (S, R) into an infinite tree
(provided every state has at least one successor). The semantics for
CTL given below is equivalent to the standard semantics with respect
to this infinite tree.⁵

⁴ CTL is actually a subset of a more general temporal logic described in
[CE81a], adopting the syntax of [BAMP81].

A path of a model K = (S, R, L) is an infinite sequence of states
(s_0, s_1, s_2, ...) ∈ S^ω such that each successive pair of states
(s_i, s_{i+1}) is an element of R. Every path is a maximal linearly ordered
subset of the tree structure unwound from s_0.

The notation K, s ⊨ f means that the formula f is true in state s
of Kripke model K. In the sequel, where the model is unambiguous,
we will write simply s ⊨ f. The interpretation of a CTL formula f
with respect to a Kripke model K is given below, by recursion over the
structure of formulas:

s ⊨ p            iff  L(s)(p), where p is an atomic proposition
s ⊨ ¬f           iff  s ⊭ f
s ⊨ f ∧ g        iff  s ⊨ f and s ⊨ g
s_0 ⊨ AX f       iff  for all paths (s_0, s_1, ...), s_1 ⊨ f
s_0 ⊨ EX f       iff  for some path (s_0, s_1, ...), s_1 ⊨ f
s_0 ⊨ A(f U g)   iff  for all paths (s_0, s_1, ...), for some i,
                      s_i ⊨ g and for all j < i, s_j ⊨ f
s_0 ⊨ E(f U g)   iff  for some path (s_0, s_1, ...), for some i,
                      s_i ⊨ g and for all j < i, s_j ⊨ f

2.2.2  Fixed  point  characterization  of  CTL 

Emerson and Clarke [CE81a] showed that various branching time
properties of programs can be characterized as extremal fixed points of
appropriate continuous functionals. Later, they introduced the logic CTL,
and showed that its operators can be characterized in this way [CE81b].
This characterization led to an efficient algorithm for the model
checking problem: determining whether a given CTL formula is satisfied in
a given state of a finite Kripke model.

⁵ With one additional distinction: in CTL, the future is taken to include the
present. Thus, if p holds in the present, then so does Fp.
To obtain the fixed point characterization, we will identify each
CTL formula f with {s | s ⊨ f}, the set of states in which the formula
is true. In this way, for example, false denotes the empty set, true
denotes S, and every subset of S represents an equivalence class of
formulas.⁶ Let P(S) be the set of subsets of S. P(S) forms a lattice
under union and intersection. This lattice is ordered by set inclusion,
where P ⊆ Q if and only if P ∪ Q = Q. A functional τ[Y] is a formula
with one uninterpreted propositional letter Y. This defines a function
P(S) → P(S), where τ(P) is obtained by taking P for Y in τ. By
definition:

1. τ is monotonic when P ⊆ Q implies τ(P) ⊆ τ(Q).

2. τ is ∪-continuous when P_1 ⊆ P_2 ⊆ ··· implies τ(∪_i P_i) = ∪_i τ(P_i).

3. τ is ∩-continuous when P_1 ⊇ P_2 ⊇ ··· implies τ(∩_i P_i) = ∩_i τ(P_i).

When the set S is finite, every increasing chain of subsets has a
maximum element, and every decreasing chain has a minimum element. As
a result, in the finite case, monotonicity implies both ∪-continuity and
∩-continuity.

A fixed point of τ is any P such that τ(P) = P. Tarski [Tar55]
showed that a monotonic functional always has a least and a greatest
fixed point with respect to inclusion ordering:

Theorem 1 (Tarski-Knaster) Whenever τ[Y] is monotonic, it has
a least fixed point, denoted μY.τ[Y], and a greatest fixed point, denoted
νY.τ[Y]. When τ[Y] is also ∪-continuous, μY.τ[Y] = ∪_{i≥0} τ^i(false).
When τ[Y] is also ∩-continuous, νY.τ[Y] = ∩_{i≥0} τ^i(true).

We can now characterize the CTL operators in terms of fixed points
of appropriate functionals:

Theorem 2 (Clarke-Emerson) Provided S is finite,

1. EFp = μY. (p ∨ EXY)

2. EGp = νY. (p ∧ EXY)

3. E(q U p) = μY. (p ∨ (q ∧ EXY))

⁶ This is essentially an algebraic interpretation of logic, where we embed the
formulas of the logic in a Boolean algebra (P(S), 0, 1, ∩, ∪, −), with ∩
representing conjunction, ∪ representing disjunction and − (set complement)
representing negation.

There is a standard algorithm for computing the least [greatest]
fixed point of a monotonic functional. This is done by starting with
false [true] and iterating the functional until a fixed point is reached,
as shown below. Assuming S is finite, this procedure terminates in at
most |S| + 1 iterations with the least [greatest] fixed point of τ[Y]:

to compute μY.τ[Y] {or νY.τ[Y]}:
    let Y = false {or Y = true}
    do
        let Y' = Y, Y = τ[Y]
    until Y' = Y
    return Y
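
The procedure above can be transcribed almost line for line into Python.
The following is a minimal sketch, assuming sets of states are represented
explicitly as frozensets of integers; the names `iterate_to_fixed_point`,
`lfp`, `gfp`, and the two example functionals `grow` and `shrink` are
hypothetical and serve only to exercise the loop.

```python
from typing import Callable, FrozenSet

StateSet = FrozenSet[int]
Functional = Callable[[StateSet], StateSet]

def iterate_to_fixed_point(tau: Functional, start: StateSet) -> StateSet:
    """Repeat Y := tau(Y) from `start` until the value stops changing."""
    y = start
    while True:
        y_prev, y = y, tau(y)
        if y == y_prev:
            return y

def lfp(tau: Functional) -> StateSet:
    """Least fixed point: start from false (the empty set)."""
    return iterate_to_fixed_point(tau, frozenset())

def gfp(tau: Functional, universe: StateSet) -> StateSet:
    """Greatest fixed point: start from true (the whole state set)."""
    return iterate_to_fixed_point(tau, universe)

# Two hypothetical monotonic functionals over S = {0, ..., 4}.
S = frozenset(range(5))
grow = lambda y: frozenset({0}) | frozenset(s + 1 for s in y if s + 1 <= 2)
shrink = lambda y: frozenset(s for s in y if s >= 2)

print(sorted(lfp(grow)))       # least fixed point of `grow`
print(sorted(gfp(shrink, S)))  # greatest fixed point of `shrink`
```

The loop is only guaranteed to terminate when the functional is monotonic
and the state set is finite; the sketch does not check either condition.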

Theorem 3 Given a finite set S, and a monotonic functional τ[Y],
the standard fixed point algorithm computes μY.τ[Y] {or νY.τ[Y]} in
at most |S| + 1 iterations.

Proof. Since τ is monotonic, τ^0[false] ⊆ τ^1[false] ⊆ τ^2[false] ⊆ ···.
The longest strictly increasing chain of subsets of S has length |S| + 1.
Hence, there must be an i such that 0 ≤ i ≤ |S| and τ^i[false] =
τ^{i+1}[false] (otherwise there would be a strictly increasing chain of length
|S| + 2). Hence, the algorithm terminates after at most |S| + 1
iterations. For any such i, ∪_{j≥0} τ^j[false] = τ^i[false]. Hence, by theorem 1,
μY.τ[Y] = τ^i[false].

For the greatest fixed point, substitute true for false, ⊇ for ⊆, and
decreasing for increasing in the above argument. □

Having a fixed point characterization of the CTL operators allows us
to use the standard fixed point algorithm to determine the set of states
of a given model in which a CTL formula is true. As an example,
consider computing EFp in the following Kripke model:⁷

⁷ We represent a Kripke model pictorially by drawing the graph (S, R) and
labeling each state with the atomic propositions which are true in that state.

Since |S| = 4, the number of iterations required to produce the fixed
point is at most 4. Therefore, let us compute τ^i[false] for i = 1...4,
where τ[Y] = p ∨ EXY. After the first iteration, we have τ^1[false] =
p ∨ EX false = p:


After the second iteration, we have τ^2[false] = p ∨ EXp:


After the third iteration, we have τ^3[false] = p ∨ EX(p ∨ EXp):


which is a fixed point, since the next iteration, τ^4[false], produces the
same result. Notice that at each iteration i, we have the set of states s_0
such that there exists a path (s_0, s_1, s_2, ...) where p is true at some
state s_j with j less than i. This algorithm can be thought of as a backward
breadth first search of the graph. In the end, we have labeled exactly the
set of states on a path to a state labeled with p.
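
The iteration for EFp can be replayed on an explicit state graph. The
sketch below uses a hypothetical four-state model (the state numbering,
the transition set `R` and the labeling `p` are assumptions made for this
illustration; they are not the model drawn in the original figure). It
records each iterate of τ[Y] = p ∨ EXY starting from false.

```python
# Hypothetical Kripke model: 0 -> 1 -> 2 -> 2 and an isolated loop 3 -> 3.
S = {0, 1, 2, 3}
R = {(0, 1), (1, 2), (2, 2), (3, 3)}
p = {2}  # states labeled with the atomic proposition p

def EX(Y):
    """States with at least one successor in Y (a backward step over R)."""
    return {s for (s, t) in R if t in Y}

def tau(Y):
    """The functional p or EX Y, whose least fixed point is EF p."""
    return p | EX(Y)

Y, trace = set(), []          # start from false, i.e. the empty set
while True:
    Y_next = tau(Y)
    trace.append(sorted(Y_next))
    if Y_next == Y:
        break
    Y = Y_next

print(trace)  # [[2], [1, 2], [0, 1, 2], [0, 1, 2]]
```

Each iterate adds the states one backward step further from p, and state 3
is never added because it cannot reach a p-state.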


As a second example, consider computing EGp in the following
Kripke model:


After the first iteration, we have τ^1[true] = p ∧ EX true = p:


After the second iteration, we have τ^2[true] = p ∧ EXp:


After the third iteration, we have τ^3[true] = p ∧ EX(p ∧ EXp):


This is the greatest fixed point, since the next iteration, τ^4[true],
produces the same result. Notice that at iteration i, we have the set of
states such that there exists a path of length i where every state satisfies
p. When we reach a fixed point, every state in the set has a successor
in the set satisfying p, hence for every state in the set, there exists an
infinite path where p is always true.
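
The greatest fixed point iteration for EGp can be replayed in the same
explicit style. Again the four-state model below is a hypothetical one
chosen for this sketch (the names `S`, `R`, `p`, `EX` and `tau` are
assumptions, and the graph is not the one in the original figure). The
iteration starts from true, the full state set, and shrinks.

```python
# Hypothetical Kripke model: 0 -> 1 -> 3 -> 3 and a self loop 2 -> 2.
S = {0, 1, 2, 3}
R = {(0, 1), (1, 3), (2, 2), (3, 3)}
p = {0, 1, 2}  # states labeled with p; state 3 is not

def EX(Y):
    """States with at least one successor in Y."""
    return {s for (s, t) in R if t in Y}

def tau(Y):
    """The functional p and EX Y, whose greatest fixed point is EG p."""
    return p & EX(Y)

Y, trace = set(S), []         # start from true, i.e. all of S
while True:
    Y_next = tau(Y)
    trace.append(sorted(Y_next))
    if Y_next == Y:
        break
    Y = Y_next

print(trace)  # [[0, 1, 2], [0, 2], [2], [2]]
```

Only state 2 survives: it is the only p-state from which an infinite path
stays inside p, since every path from 0 or 1 eventually reaches state 3.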

The operators EX, E( U ) and EG are actually sufficient to
characterize the entire logic, since the remaining operators can be derived
from these three according to the following rules:

EFp = E(true U p)

AXp = ¬EX¬p

AGp = ¬EF¬p

A(q U p) = ¬[E(¬p U (¬q ∧ ¬p)) ∨ EG¬p]
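
These derivation rules can be checked by brute force on a small model. The
sketch below is an illustration under stated assumptions: the four-state
model (`S`, `R`) is hypothetical and has at least one successor per state,
and the helper names are invented for the example. It derives AX, AG and
A( U ) from EX, E( U ) and EG, and compares the results with the sets
obtained by iterating the AX-based functionals νY. (p ∧ AXY) and
μY. (p ∨ (q ∧ AXY)) directly, for every pair of state sets p and q.

```python
from itertools import combinations

# Hypothetical Kripke model in which every state has a successor.
S = frozenset({0, 1, 2, 3})
R = {(0, 1), (0, 2), (1, 1), (2, 3), (3, 3)}

def neg(P): return S - P
def EX(P): return frozenset(s for (s, t) in R if t in P)

def fix(tau, Y):
    """Iterate a monotonic functional from Y until it stops changing."""
    while tau(Y) != Y:
        Y = tau(Y)
    return Y

# The three basic operators, as fixed points over explicit state sets.
def EU(Q, P): return fix(lambda Y: P | (Q & EX(Y)), frozenset())
def EG(P): return fix(lambda Y: P & EX(Y), S)

# Derived operators, following the rules stated in the text.
def EF(P): return EU(S, P)
def AX(P): return neg(EX(neg(P)))
def AG(P): return neg(EF(neg(P)))
def AU(Q, P): return neg(EU(neg(P), neg(Q) & neg(P)) | EG(neg(P)))

def subsets():
    items = sorted(S)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def rules_agree():
    """Compare derived AG and AU with direct AX-based fixed points."""
    for P in subsets():
        if AG(P) != fix(lambda Y: P & AX(Y), S):
            return False
        for Q in subsets():
            if AU(Q, P) != fix(lambda Y: P | (Q & AX(Y)), frozenset()):
                return False
    return True

print(rules_agree())
```

The exhaustive comparison is feasible only because the model has four
states; it is a sanity check on one model, not a proof of the rules.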

For this reason, in the sequel, we will consider only the operators
EX, E( U ) and EG. However, for completeness, here are the fixed
point characterizations of the remaining operators:

AGp = νY. (p ∧ AXY)

A(q U p) = μY. (p ∨ (q ∧ AXY))

The fixed point characterization provides an effective algorithm for
the model checking problem. In fact, a more efficient algorithm exists,
based on breadth first search and the calculation of strongly connected
components in the graph (S, R) [CES86]. Both of these algorithms
suffer from the state explosion problem, however: it is necessary to
construct the complete state graph of the system being modeled before
model checking can be applied. Since the number of states of a system
grows exponentially in the number of its components, these algorithms
can only be applied to small systems.


2.3  Symbolic  CTL  model  checking 

In the previous section, we equated a CTL formula with the set of states
in which the formula is true. We showed how the CTL operators can
thus be characterized as fixed points of certain monotonic functionals
in the lattice of subsets, and how these fixed points can be computed
iteratively. In this section, we equate sets and relations with Boolean
formulas, and show how set theoretic operations such as union,
intersection and image can be characterized in terms of Boolean operations.
This allows the CTL model checking algorithm to be implemented using
well developed automatic techniques for manipulating Boolean
formulas. Since the state graph is symbolically represented by a Boolean
formula, there is no need to actually construct it as an explicit data
structure. Hence, the state explosion problem can be avoided.

2.3.1  Quantified  Boolean  formulas 

Quantified Boolean Formulas (QBF) are an extension of propositional
logic allowing quantifiers over propositional variables. Given a set V of
propositional variables, QBF(V) is the least set of formulas such that

1. true and false are formulas,

2. every variable in V is a formula,

3. if p and q are formulas, then so are p ∨ q and ¬p, and

4. if p is a formula and v is in V, then ∃v. p is a formula.

A truth assignment is a function V → {false, true}. We equate each
QBF formula with the set of truth assignments that satisfy the
formula. Thus, true represents the set of all truth assignments, false the
empty set, and a propositional variable v represents the set of all truth
assignments a such that a(v) = true. In addition,

1. a ∈ (p ∨ q) if and only if a ∈ p or a ∈ q,

2. a ∈ (¬p) if and only if a ∉ p, and

3. a ∈ (∃v. p) iff a(v ← true) ∈ p or a(v ← false) ∈ p,

where a(v ← x) denotes the truth assignment that agrees with a except
that it maps v to x.

It is useful to define an operator for QBF that substitutes a formula
for a variable. If p and q are QBF formulas, and v is a variable, then
let a ∈ p(v ← q) if and only if a(v ← (a ∈ q)) ∈ p. Note that
quantification can be defined in terms of substitution, since ∃v. p =
p(v ← false) ∨ p(v ← true).
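
The definitions of substitution and quantification can be made concrete by
modelling a formula as a predicate on truth assignments. The sketch below
is only an illustration: the two-variable set `V`, the helper names
(`var`, `disj`, `neg`, `conj`, `subst`, `exists`, `models`) and the
dictionary encoding of assignments are all assumptions, and nothing here
reflects how formulas are represented in an OBDD-based implementation.

```python
from itertools import product

V = ("a", "b")  # hypothetical set of propositional variables

def assignments():
    """All truth assignments V -> {False, True}, as dictionaries."""
    return [dict(zip(V, bits))
            for bits in product([False, True], repeat=len(V))]

# A formula is modelled as a predicate on truth assignments.
def var(v): return lambda a: a[v]
def disj(p, q): return lambda a: p(a) or q(a)
def neg(p): return lambda a: not p(a)
def conj(p, q): return neg(disj(neg(p), neg(q)))

def subst(p, v, q):
    """p(v <- q): evaluate p with v rebound to the truth value of q."""
    return lambda a: p({**a, v: q(a)})

def exists(v, p):
    """Quantification via substitution: p(v <- false) or p(v <- true)."""
    false, true = (lambda a: False), (lambda a: True)
    return disj(subst(p, v, false), subst(p, v, true))

def models(p):
    """The set of truth assignments denoted by p, as a list."""
    return [a for a in assignments() if p(a)]

p = conj(var("a"), neg(var("b")))   # a and not b
print(models(exists("a", p)))       # same assignments as for: not b
```

Quantifying a out of a ∧ ¬b leaves exactly the assignments in which b is
false, which is what the substitution identity predicts.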

Quantification and substitution can also be defined for vectors of
variables. If W = (w_1, ..., w_n) is an n-tuple of propositional variables
and Q = (q_1, ..., q_n) an n-tuple of formulas, then, writing B for the set
{false, true}, let

1. a ∈ (∃W. p) iff for some b : W → B, a(w_i ← b(w_i)) ∈ p, and

2. a ∈ p(W ← Q) iff a(w_i ← (a ∈ q_i)) ∈ p.

2.3.2  Representing  sets  and  relations 

The state of a concurrent system is generally modeled as a vector where
each component represents the state of one component of the system.
For the moment, let us make the simplifying assumption that all of
the state components are Boolean valued, as is generally the case in
digital systems. A state of the system can therefore be viewed as a
truth assignment to a set of propositional variables V = {v_1, v_2, ..., v_n}.

Under this interpretation, every QBF formula over the set of state
variables V denotes a set of states, i.e., the set of truth assignments
which satisfy the formula. For example, if we have two state variables
a and b, then the formula a ∨ b represents all the states in which a is
true or b is true.

In order to represent a binary relation with a QBF formula, we
introduce two ordered sets of variables V = {v_1, ..., v_n} and V' =
{v'_1, ..., v'_n}. The set V represents the left argument of the relation,
and the set V' represents the right argument. By this arrangement, a
QBF formula R over the variables V ∪ V' stands for a binary relation
R', the set of pairs (x, y) in (V → B)² such that

(x, y) ∈ R' iff x(v'_i ← y(v_i)) ∈ R    (2.9)

As an example, if we have two state variables, a and b, then the QBF
formula a ∧ b' represents all ordered pairs of states such that a is true
in the first state, and b is true in the second state.

Using this representation, we can express a variety of standard set
theoretic operations in terms of the QBF connectives. For example, the
union of two sets represented by A and B is A ∨ B, their intersection
is A ∧ B and the complement of A is ¬A.

The image R'(Q) of a set Q via a binary relation R' is the set of
all y such that for some x ∈ Q, (x, y) ∈ R'. If R is a QBF formula
representing a relation R', and Q is a QBF formula representing a set,
then

R'(Q) = (∃V. (R ∧ Q))(V' ← V)    (2.10)

We can prove this by simply expanding the definitions of the QBF
operators, as follows:

     y ∈ (∃V. (R ∧ Q))(V' ← V)
iff  y(v'_i ← y(v_i)) ∈ ∃V. (R ∧ Q)
iff  there exists x : V → B s.t. y(v'_i ← y(v_i))(v_i ← x(v_i)) ∈ R ∧ Q
iff  there exists x : V → B s.t. (x, y) ∈ R' and x ∈ Q
iff  y ∈ R'(Q)

As an example, let Q = a ∨ b and R = a ∧ b'. Then

R'(Q) = (∃V. ((a ∧ b') ∧ (a ∨ b)))(V' ← V)
      = (∃V. (a ∧ b'))(V' ← V)
      = b'(V' ← V)
      = b
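
The same example can be replayed mechanically. In the sketch below,
formulas are modelled as Python predicates over dictionaries that assign
truth values to the variables a, b, a' and b'; this encoding and the
helper names `all_assignments`, `exists_vars`, `rename_next_to_current`
and `image` are assumptions made for illustration only, standing in for
the QBF operations of (2.10).

```python
from itertools import product

V = ("a", "b")            # current-state variables
V_NEXT = ("a'", "b'")     # next-state (primed) variables

def all_assignments(names):
    return [dict(zip(names, bits))
            for bits in product([False, True], repeat=len(names))]

def exists_vars(names, p):
    """(exists names. p): some rebinding of `names` satisfies p."""
    return lambda a: any(p({**a, **b}) for b in all_assignments(names))

def rename_next_to_current(p):
    """p(V' <- V): each primed variable takes its unprimed twin's value."""
    return lambda a: p({**a, **{vp: a[v] for v, vp in zip(V, V_NEXT)}})

def image(R, Q):
    """The image of Q via R, computed as (exists V. (R and Q))(V' <- V)."""
    return rename_next_to_current(
        exists_vars(V, lambda a: R(a) and Q(a)))

Q = lambda a: a["a"] or a["b"]      # the set a or b
R = lambda a: a["a"] and a["b'"]    # the relation a and b'

img = image(R, Q)
print([s for s in all_assignments(V) if img(s)])
```

The resulting predicate holds in exactly the states where b is true,
matching the hand derivation above.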

The inverse image R'⁻¹(Q) of a set Q via a binary relation R' is the
set of all x such that for some y ∈ Q, (x, y) ∈ R'. If R is a QBF formula
representing a relation R', and Q is a QBF formula representing a set,
then

R'⁻¹(Q) = ∃V'. (R ∧ Q(V ← V'))    (2.11)

This can be shown by a derivation similar to the one above.

2.3.3  CTL  formulas 

We now have the necessary mechanics to represent Kripke structures
using QBF formulas, and to characterize the CTL operators over these
symbolically represented Kripke structures using QBF operators. In
fact, it is only necessary to characterize the CTL operator EX, since the
logical connectives have identical meanings in both logics, and the
remaining CTL operators have already been characterized as fixed points
of functionals using only EX and the logical operators.

To represent a Kripke structure symbolically, we will assume two
sets of variables $V = \{v_1, \ldots, v_n\}$ and $V' = \{v_1', \ldots, v_n'\}$, and a QBF
formula $R$ on $V \cup V'$ to represent the transition relation. This induces
a Kripke structure $K_{V,V',R} = (S, R', L)$ where

1. The state set $S$ is the set of truth assignments $V \to B$.

2. The transition relation $R'$ is the relation represented by the
formula $R$, according to (2.9).

3. The valuation $L$ yields the truth value of each variable $v_i$ in each
state $s$. That is, for all $v_i \in V$, $L(s)(v_i) = s(v_i)$.

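The complete procedure for symbolic model checking is characterized
by the following theorem:

Theorem 4 Let $V = \{v_1, \ldots, v_n\}$ and $V' = \{v_1', \ldots, v_n'\}$ be disjoint
sets of variables, let $R$ be a QBF formula on $V \cup V'$ and let $K_{V,V',R}$ be
the induced Kripke model. In this model, for all CTL formulas $p$ and $q$,

$$
\begin{array}{llr}
s \models p & \text{iff } s \in p, \text{ where } p \in V & (2.12) \\
s \models p \vee q & \text{iff } s \in (p \vee q) & (2.13) \\
s \models \neg p & \text{iff } s \in (\neg p) & (2.14) \\
s \models \mathrm{EX}\,p & \text{iff } s \in (\exists V'.\,(R \wedge p(V \leftarrow V'))) & (2.15) \\
s \models \mathrm{E}(q \,\mathrm{U}\, p) & \text{iff } s \in \mu Y.\,(p \vee (q \wedge \mathrm{EX}\,Y)) & (2.16) \\
s \models \mathrm{EG}\,p & \text{iff } s \in \nu Y.\,(p \wedge \mathrm{EX}\,Y) & (2.17)
\end{array}
$$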


Proof. The first three are trivial matters of definition. For (2.15),
when we equate a formula with the set of states satisfying it, $\mathrm{EX}\,p$ is
just ${R'}^{-1}(p)$, which is equal to $\exists V'.\,(R \wedge p(V \leftarrow V'))$. The last two are
just theorem 2. □

The above theorem shows that we can solve the model checking
problem (i.e., determining whether a given state in a symbolically
represented Kripke structure $K_{V,V',R}$ satisfies a formula $f$) purely by
manipulations of Boolean formulas. A key point is that the Kripke
structure itself is never built. Instead it is symbolically represented
by a QBF formula. As an example, consider a system with one state
variable $b$. Let the transition relation be represented by the formula
$R = b \vee b'$, and let state $s$ be $(b \leftarrow \mathit{false})$. The induced Kripke structure
$K_{\{b\},\{b'\},R}$ is depicted below:

[Figure: the induced Kripke structure. The state in which $b$ is false has a
single transition, to the state in which $b$ is true; the state in which $b$
is true has transitions to both states.]


Let's say we want to determine whether or not $s \models \mathrm{EX}\,\neg b$. According
to theorem 4, we can evaluate the formula $\mathrm{EX}\,\neg b$ as follows:

$$
\begin{aligned}
\mathrm{EX}\,\neg b &= \exists b'.\,(R \wedge (\neg b)(b \leftarrow b')) \\
 &= \exists b'.\,((b \vee b') \wedge (\neg b')) \\
 &= \exists b'.\,(b \wedge \neg b') \\
 &= b
\end{aligned}
$$

Hence, by theorem 4, $s \models \mathrm{EX}\,\neg b$ iff the assignment $(b \leftarrow \mathit{false})$ satisfies
$b$, which is false.

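Now consider the problem of whether or not $s \models \mathrm{EF}\,b$. Using the
standard fixed point algorithm, we get

$$
\begin{aligned}
\tau[\mathit{false}] &= b \vee \mathrm{EX}\,\mathit{false} = b \\
\tau^2[\mathit{false}] &= b \vee \mathrm{EX}\,b = b \vee \exists b'.\,((b \vee b') \wedge b') = \mathit{true} \\
\tau^3[\mathit{false}] &= b \vee \mathrm{EX}\,\mathit{true} = \mathit{true}
\end{aligned}
$$

A fixed point is reached after two iterations. Hence, $s \models \mathrm{EF}\,b$ iff the
truth assignment $(b \leftarrow \mathit{false})$ satisfies $\mathit{true}$, which is true.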
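The two calculations above can be replayed mechanically. The sketch below is an illustration under simplifying assumptions: it represents each formula by the explicit set of states satisfying it rather than by a Boolean formula, so it shows only the semantics of EX and of the fixed point iteration, not the symbolic representation. The function names `sat`, `ex` and `lfp` are hypothetical.

```python
from itertools import product

VARS = ["b"]
STATES = [dict(zip(VARS, bits)) for bits in product([False, True], repeat=len(VARS))]


def R(s, t):
    """Transition relation R = b OR b' (s is the current state, t the next state)."""
    return s["b"] or t["b"]


def key(s):
    """Hashable form of a state."""
    return tuple(s[v] for v in VARS)


def sat(pred):
    """The set of states satisfying a predicate."""
    return frozenset(key(s) for s in STATES if pred(s))


def ex(Y):
    """EX Y: the states with at least one successor in Y."""
    return frozenset(key(s) for s in STATES
                     if any(R(s, t) and key(t) in Y for t in STATES))


def lfp(tau):
    """Standard least fixed point iteration, starting from the empty set (false)."""
    Y, steps = frozenset(), 0
    while True:
        nxt = tau(Y)
        if nxt == Y:
            return Y, steps
        Y, steps = nxt, steps + 1


b = sat(lambda s: s["b"])
not_b = sat(lambda s: not s["b"])

ex_not_b = ex(not_b)                       # EX (NOT b), which equals b
ef_b, steps = lfp(lambda Y: b | ex(Y))     # EF b = mu Y. (b OR EX Y)

s0 = (False,)                              # the state (b <- false)
```

With these definitions `s0` is not in `ex_not_b` but is in `ef_b`, and `steps` counts the two iterations that change the approximation before the fixed point is detected.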

Note that when computing least (or greatest) fixed points of $\tau$, $2^n + 1$
iterations are required in the worst case, where $n$ is the number of
propositional state variables. This is the length of the longest possible
strictly increasing (or decreasing) chain of subsets of $S$ (not including
the empty set), plus one extra iteration to detect the fixed point. In
practice, however, the number of iterations required to reach a fixed
point can be quite small.


2.3.4  Binary  Decision  Diagrams 

It should be clear that to make the symbolic model checking technique
practical, an efficient automated method for manipulating Boolean
formulas is required. Fortunately, a variety of such techniques have been
developed for the purpose of synthesizing digital circuits or comparing
the functionality of digital circuits. These techniques may involve
applying a set of rewriting rules to convert a given formula into a normal
form. Alternatively, a data structure may be used to represent the
formula as a Boolean function.* For example, a Boolean function may be
represented by a truth table, or by a set of "cubes" which cover the
truth table of the function, or by a binary decision tree. Each
representation has associated procedures for applying Boolean operations.
Any method of manipulating Boolean formulas that can implement the
operations $p \wedge q$, $p \vee q$, $\neg p$, $\exists V'.\,p$ and $p(V \leftarrow V')$ can be used in
symbolic model checking. By far the most effective method known to date,
however, is the Ordered Binary Decision Diagram method developed
by Bryant [Bry86].

* Normally, when discussing switching functions, we think of a Boolean
formula as represented by a function rather than the set of satisfying truth
assignments of the formula. A Boolean formula $f$ over an ordered set of
variables $V = (v_1, \ldots, v_n)$ induces a function $f : \{0,1\}^n \to \{0,1\}$ in the
obvious way: $f(x_1, \ldots, x_n) = 1$ iff the truth assignment $(v_i \leftarrow x_i)$ satisfies $f$.
The two views are equivalent, but the functional representation seems to be
more standard in the context of Boolean manipulation.

Ordered Binary Decision Diagrams are a form of reduced decision
graph that give a compact canonical representation for Boolean formulas.
They have been used extensively for comparison of switching
functions [BBB+87, FB89]. The OBDD canonical representation for a
Boolean function can be derived by reducing a related structure called
an ordered decision tree. In an ordered decision tree, the value of the
function is obtained by descending the tree from the root to a leaf. At
each node along the path, one descends to the left child if the value of
the variable labeling the node is 0, and to the right child if the value is 1.
Each leaf of the tree is labeled with a value 0 or 1 which gives the result
of the function. The tree is said to be ordered if the variables always
occur in the same order along any path from root to leaf. In this case,
reading the leaves from left to right, one obtains the truth table of the
function.

As an example, an ordered decision tree for the function $(a \wedge b) \vee (c \wedge d)$
is depicted in figure 2.1.

Figure 2.1: Ordered Decision Tree

The canonical OBDD form is a directed acyclic graph which can be
obtained from the ordered decision tree by the following two steps:

1. Combine any isomorphic subtrees into a single tree.

2. Eliminate any nodes whose left and right children are isomorphic.

Steps 1 and 2 can be applied in a bottom up fashion, to yield the
canonical OBDD representation in linear time. Bryant called this
operation Reduce. The size of the resulting graph is strongly dependent
on the order of the variables. This variable ordering, however, is the
key to obtaining the reduced form. This is what distinguishes OBDDs
from the more general class of Binary Decision Diagrams described by
Akers [Ake78].

As an illustration of reduction to canonical form, consider the
ordered decision tree of figure 2.1. Three of the nodes marked in the
figure are roots of isomorphic subtrees. Thus, they can be combined into
a single subtree. In addition, from another marked node one arrives at the
same subtree when descending to the left or right (i.e., independently of
the value of $b$), hence this vertex does not affect the value of the function
and may be eliminated. The result of applying the Reduce operation
to the tree of figure 2.1 is depicted in figure 2.2. Note
the significant reduction in the number of vertices, resulting essentially
from redundancy in the truth table of the function.
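The reduction can be reproduced for the example function. What follows is a small sketch, not Bryant's Reduce procedure itself: it assumes the variable order a, b, c, d from root to leaf, builds the full ordered decision tree as nested tuples, and then applies the two reduction steps bottom up. Because Python compares tuples structurally, sharing of isomorphic subgraphs is modelled by a dictionary (`unique`) rather than by pointers; all names in the block are hypothetical.

```python
from itertools import product

ORDER = ["a", "b", "c", "d"]            # assumed variable order, root first


def f(a, b, c, d):
    return (a and b) or (c and d)


def build_tree(func, order, fixed=()):
    """Full ordered decision tree: nested (var, low, high) tuples with 0/1 leaves."""
    if len(fixed) == len(order):
        return int(bool(func(*fixed)))
    var = order[len(fixed)]
    return (var,
            build_tree(func, order, fixed + (False,)),
            build_tree(func, order, fixed + (True,)))


def count_tree_nodes(t):
    """Number of non-leaf nodes in an (unshared) decision tree."""
    return 0 if t in (0, 1) else 1 + count_tree_nodes(t[1]) + count_tree_nodes(t[2])


def reduce_tree(t, unique):
    """Bottom-up reduction; `unique` records each distinct non-terminal once."""
    if t in (0, 1):
        return t
    var, low, high = t
    low, high = reduce_tree(low, unique), reduce_tree(high, unique)
    if low == high:                       # step 2: both children agree, drop the node
        return low
    node = (var, low, high)
    return unique.setdefault(node, node)  # step 1: share isomorphic subgraphs


def evaluate(node, env):
    """Follow the path selected by the assignment env down to a leaf."""
    while node not in (0, 1):
        var, low, high = node
        node = high if env[var] else low
    return node


tree = build_tree(f, ORDER)
unique = {}
obdd = reduce_tree(tree, unique)
tree_size = count_tree_nodes(tree)      # non-leaf nodes in the full tree
obdd_size = len(unique)                 # non-terminal nodes after reduction
```

For this function and order the full tree has 15 non-leaf nodes, while the reduced graph has 4 non-terminals, one per variable. Building the full tree first is exponential and is done here only to mirror the explanation in the text.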

The canonical OBDDs are a subclass of DAGs (directed acyclic
graphs) where each leaf is labeled by 0 or 1, and each non-leaf is labeled
by a variable. It is most convenient to define this class inductively, by
building large DAGs from smaller ones. For this reason, we will number
the variables from the bottom up.* In the sequel, the term dimension
will be used to denote the highest variable index occurring in a DAG.

* Unfortunately, this is the opposite of the numbering adopted by Bryant, but it
makes the proofs clearer.

Figure 2.2: Ordered Binary Decision Diagram


We will simultaneously define the class of DAGs which are canonical
OBDDs and the functions they denote, by induction on the dimension:

Definition 1 Let $V$ be an $n$-tuple $(v_1, v_2, \ldots, v_n)$ of variables. The
class $\mathit{OBDD}(V)$ consists of the terminals 0 and 1, and a collection of
triples in $V \times \mathit{OBDD}(V) \times \mathit{OBDD}(V)$ called non-terminals. With each
element $p$ of $\mathit{OBDD}(V)$, we associate a dimension $d_p$, where $0 \le d_p \le n$,
and a Boolean function $f_p : B^n \to B$. The class $\mathit{OBDD}(V)$ is the
least such that, for all $x \in \{0, 1\}^n$:

1. $0 \in \mathit{OBDD}(V)$, $d_0 = 0$, and $f_0(x) = 0$.

2. $1 \in \mathit{OBDD}(V)$, $d_1 = 0$, and $f_1(x) = 1$.

3. if $l$ and $h$ are distinct elements of $\mathit{OBDD}(V)$, where $d_l < i \le n$
and $d_h < i$, then the triple $r = (v_i, l, h)$ is also in $\mathit{OBDD}(V)$,
$d_r = i$ and

$$
f_r(x) = \begin{cases} f_l(x) & \text{if } x_i = 0 \\ f_h(x) & \text{if } x_i = 1 \end{cases}
$$
With regard to canonicity, the salient aspects of the above definition
are that a triple $(v_i, l, h)$ is a canonical OBDD only if $l$ and $h$ are
distinct and $i$ is greater than the dimensions of $l$ and $h$ (the variable
ordering requirement).* One important consequence of this is that $f_p$,
the function represented by a DAG $p$, does not depend on any variables
of index greater than $d_p$:

Lemma 1 For all $p \in \mathit{OBDD}(V)$, for all $d_p < i \le n$,
$f_p(v_i \leftarrow 0) = f_p(v_i \leftarrow 1)$.

Proof. By induction over $d_p$. We assume the statement of the
lemma holds for all $q$ such that $d_q < d_p$. The terminal cases, $p = 0$
and $p = 1$, are trivial. For the non-terminal case, let $p = (v_j, l, h)$, where
$j < i$. Now consider two cases, $v_j = 0$ and $v_j = 1$. In the first case,
$f_p(v_i \leftarrow 0) = f_l(v_i \leftarrow 0)$ and $f_p(v_i \leftarrow 1) = f_l(v_i \leftarrow 1)$. These are equal
by inductive hypothesis, since $d_l < i$. The other case, $v_j = 1$, is similar,
with $f_h$ for $f_l$. □

* There is an alternative formulation of OBDDs due to Clarke [KC90] which does
not require $l$ and $h$ to be distinct, but requires that $i = d_l + 1 = d_h + 1$. In this case,
the OBDD for a function $f$ is exactly the minimal DFA recognizing the language
$\{x \in \{0, 1\}^n \mid f(x) = 1\}$. Thinking of OBDDs as minimal DFAs can provide useful
insights into the complexity of representing certain classes of functions as OBDDs.

It is not difficult to show that OBDDs canonically represent the
Boolean functions. That is, each Boolean function is represented by
exactly one OBDD. We show first that there are no two distinct OBDDs
representing the same function, and second, that every Boolean
function is represented by some OBDD. The following theorem is
essentially due to Bryant [Bry86], although the formalization is different,
and as a result, it is hoped, the proof is substantially simpler.

Theorem 5 (Bryant) If $p$ and $p'$ are elements of $\mathit{OBDD}(V)$, then
$f_p = f_{p'}$ implies $p = p'$.

Proof. By simultaneous induction over $d_p$ and $d_{p'}$. We assume the
statement of the theorem holds for all $q$ and $q'$, where $d_q < d_p$ and
$d_{q'} < d_{p'}$. Suppose that $f_p = f_{p'}$.

Consider first the case where $d_p = d_{p'}$. Either $p$ and $p'$ are both
terminals (in which case $p = p' = 0$ or $p = p' = 1$) or they are both
non-terminals, $p = (v_i, l, h)$ and $p' = (v_i, l', h')$. For non-terminals, we
have $f_l = f_p(v_i \leftarrow 0) = f_{p'}(v_i \leftarrow 0) = f_{l'}$, and similarly
$f_h = f_p(v_i \leftarrow 1) = f_{p'}(v_i \leftarrow 1) = f_{h'}$. Hence, by induction, $l = l'$ and $h = h'$, so
$p = p'$.

Second, consider the case where $d_p > d_{p'}$. It follows that $p$ is a
non-terminal $(v_i, l, h)$. Further, by the previous lemma, $f_{p'}(v_i \leftarrow 0) =
f_{p'}(v_i \leftarrow 1)$. Therefore, $f_p(v_i \leftarrow 0) = f_p(v_i \leftarrow 1)$, so $f_l = f_h$. By
induction, then, $l = h$. This is a contradiction, however, since if $l$ and
$h$ are not distinct, then $p$ is not in $\mathit{OBDD}(V)$.

A symmetric argument applies to the case $d_p < d_{p'}$. □


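Theorem 6 Given a function $f : B^n \to B$, there exists $p \in \mathit{OBDD}(V)$
such that $f_p = f$.

Proof. By induction on $i$, the greatest number such that $f(v_i \leftarrow 0) \neq
f(v_i \leftarrow 1)$. (If there is no such $i$, then $f$ is constant and is represented
by a terminal.) By inductive hypothesis, there exist $q$ and $r$ in
$\mathit{OBDD}(V)$ such that $f_q = f(v_i \leftarrow 0)$ and $f_r = f(v_i \leftarrow 1)$. Further, $q$
and $r$ are distinct, since $f(v_i \leftarrow 0) \neq f(v_i \leftarrow 1)$. Thus, let $p = (v_i, q, r)$.
□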


Because each function is represented by a unique OBDD, testing
two OBDDs for functional equality can be accomplished in constant
time (provided that each distinct OBDD is stored only once, so that
equality reduces to comparing two references). This property of OBDDs
is useful for determining when a fixed point has been reached in the
standard fixed point algorithm.

The Apply algorithm

Bryant describes an algorithm called Apply, which applies an arbitrary
Boolean operation $\bullet$ to two OBDDs. The operation $\bullet$ can be any of
the 16 Boolean functions of two variables; Apply computes the natural
extension of $\bullet$ to two Boolean functions. Given two non-terminal
OBDDs $p$ and $q$, the Apply algorithm breaks the problem of computing
$r = p \bullet q$ into two subproblems on the children of $p$ and $q$.

Take first the case where $d_p = d_q$. Let $p = (v_i, l_p, h_p)$ and $q =
(v_i, l_q, h_q)$. It is easily shown that

• $r(v_i \leftarrow 0) = p(v_i \leftarrow 0) \bullet q(v_i \leftarrow 0) = l_p \bullet l_q$, and

• $r(v_i \leftarrow 1) = p(v_i \leftarrow 1) \bullet q(v_i \leftarrow 1) = h_p \bullet h_q$.

Thus, we create two subproblems $l = l_p \bullet l_q$ and $h = h_p \bullet h_q$. On
the other hand, suppose that $p = (v_i, l_p, h_p)$ and $q = (v_j, l_q, h_q)$, where
$i > j$. In this case, $q(v_i \leftarrow 0) = q(v_i \leftarrow 1) = q$. So,

• $r(v_i \leftarrow 0) = p(v_i \leftarrow 0) \bullet q = l_p \bullet q$, and

• $r(v_i \leftarrow 1) = p(v_i \leftarrow 1) \bullet q = h_p \bullet q$.

Therefore, we create two subproblems $l = l_p \bullet q$ and $h = h_p \bullet q$. The
remaining case, $i < j$, is symmetric.

The subproblems are solved recursively to obtain $l = r(v_i \leftarrow 0)$ and
$h = r(v_i \leftarrow 1)$. From these two cofactors, we can derive $r$. If $l$ and
$h$ are equal, then $r = h = l$. If they are distinct, then $r = (v_i, l, h)$.
Finally, if $p$ and $q$ are both terminals, Apply simply uses the truth table
for $\bullet$.

Since each subproblem of dimension $d$ can generate two subproblems
of dimension $d - 1$, it might seem that this algorithm is exponential. It
can be made polynomial, however, by applying dynamic programming.
Notice that each subproblem is determined by a pair of OBDDs $p'$ and
$q'$ which are descendants of $p$ and $q$ respectively. Hence, the maximum
number of distinct subproblems is the product of the size of $p$ and the
size of $q$. By keeping a hash table of triples $(p, q, r)$, we can reduce
the number of recursive calls to $|p| \cdot |q|$. Bryant shows that this upper
bound is tight, since there exist functions $p$ and $q$ for which the size of
$r$ is $|p| \cdot |q|$.
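The recursion and the table of solved subproblems can be sketched as follows. This is a simplified illustration rather than Bryant's implementation: terminals are the Python integers 0 and 1, a non-terminal is a tuple `(i, l, h)` using the bottom-up variable numbering of definition 1, and nodes are compared structurally, whereas a practical package would store each node once and compare references. The names `dim`, `mk`, `cofactors`, `apply_op`, `var` and `evaluate` are assumptions of this sketch.

```python
import operator


def dim(p):
    """Dimension d_p: 0 for a terminal, otherwise the index of the root variable."""
    return 0 if p in (0, 1) else p[0]


def mk(i, l, h):
    """Build the result from its two cofactors, never creating a node with l == h."""
    return l if l == h else (i, l, h)


def cofactors(p, i):
    """The pair (p(v_i <- 0), p(v_i <- 1)), assuming i >= dim(p)."""
    return (p[1], p[2]) if dim(p) == i else (p, p)


def apply_op(op, p, q, table=None):
    """Compute r = p op q, remembering solved subproblems in `table`."""
    if table is None:
        table = {}
    if p in (0, 1) and q in (0, 1):
        return int(op(p, q))              # both terminals: use the truth table of op
    if (p, q) not in table:
        i = max(dim(p), dim(q))
        (pl, ph), (ql, qh) = cofactors(p, i), cofactors(q, i)
        l = apply_op(op, pl, ql, table)
        h = apply_op(op, ph, qh, table)
        table[(p, q)] = mk(i, l, h)
    return table[(p, q)]


def var(i):
    """The OBDD for the single variable v_i."""
    return (i, 0, 1)


def evaluate(p, x):
    """f_p(x), where x[i] is the value assigned to v_i."""
    while p not in (0, 1):
        i, l, h = p
        p = h if x[i] else l
    return p


# (a AND b) OR (c AND d), with a = v_4, b = v_3, c = v_2, d = v_1.
a, b, c, d = var(4), var(3), var(2), var(1)
ab = apply_op(operator.and_, a, b)
cd = apply_op(operator.and_, c, d)
f = apply_op(operator.or_, ab, cd)
```

Because the result is canonical, computing the disjunction with the operands in the other order, `apply_op(operator.or_, cd, ab)`, yields a structurally identical value. A fresh table is used for each top-level call here; sharing one table across calls with different operations would require the operation to be part of the key.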

The Compose algorithm

Bryant also gives an algorithm called Compose, which computes $p(v_i \leftarrow q)$,
where $p$ and $q$ are OBDDs, and $v_i$ is a variable. The algorithm is
easily adapted for simultaneous substitution of a vector of variables.
Hence, given that

$$\exists v_i.\,p = p(v_i \leftarrow 0) \vee p(v_i \leftarrow 1), \tag{2.18}$$

the compose procedure could be used to implement both the variable
substitution operation $p(V \leftarrow V')$ and the existential quantification
operation $\exists V'.\,p$ needed for symbolic model checking. On the other
hand, a much more efficient procedure can be obtained by combining
the quantification and conjunction operations in the expression for $\mathrm{EX}\,p$
into a single OBDD operation computing $\exists V'.\,(p \wedge q)$. Applying the
quantifiers in a bottom-up fashion as the conjunction subproblems are
solved results in a substantial reduction in the size of the intermediate
results by reducing the number of variables.
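As a small worked instance of (2.18), offered as an illustration rather than as part of the original development, take the one-variable system used earlier, with $R = b \vee b'$, and eliminate $b'$ from $R \wedge \neg b'$ by expanding the two cofactors:

```latex
\begin{aligned}
\exists b'.\,((b \vee b') \wedge \neg b')
  &= ((b \vee 0) \wedge \neg 0) \vee ((b \vee 1) \wedge \neg 1) \\
  &= (b \wedge 1) \vee (1 \wedge 0) \\
  &= b
\end{aligned}
```

This agrees with the value obtained for $\mathrm{EX}\,\neg b$ in section 2.3.3.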

The AndExists algorithm

This algorithm, which we will call AndExists, is basically a modification
of Apply. Let $r$ be the OBDD representing the function $\exists V'.\,(p \wedge q)$.
We compute $r$ by generating subproblems $l$ and $h$ in the same manner
as if using the Apply algorithm for $\bullet = \wedge$. When the results of the
subproblems are obtained, if the leading variable $v_i$ is a component of
$V'$, the result is $r = l \vee h$ (see equation 2.18). This result is obtained
by calling Apply with $\bullet = \vee$. On the other hand, if $v_i$ does not occur
in $V'$, then the result is the same as for Apply: if $l = h$, then $r = l = h$,
else $r = (v_i, l, h)$.

The motivation for this algorithm is to avoid producing the entire
OBDD for $p \wedge q$, which has $2n$ variables, where $n$ is the number of state
variables of the model. This is done by applying existential
quantification to the results of subproblems as soon as they become available,
yielding a result with only $n$ variables. Empirically, this provides a
substantial savings in space.
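The combined operation can be sketched in the same style as Apply. The code below is an illustration under stated assumptions, not the implementation used for the experiments: terminals are 0 and 1, a non-terminal is a tuple `(i, l, h)` with variables numbered from the bottom up, nodes are compared structurally, and the set `quantified` holds the indices of the variables in $V'$. All function names are hypothetical.

```python
import operator


def dim(p):
    """Dimension: 0 for a terminal, otherwise the index of the root variable."""
    return 0 if p in (0, 1) else p[0]


def mk(i, l, h):
    return l if l == h else (i, l, h)


def cofactors(p, i):
    return (p[1], p[2]) if dim(p) == i else (p, p)


def apply_op(op, p, q, table):
    """Apply with a table of solved subproblems, keyed by operation and operands."""
    if p in (0, 1) and q in (0, 1):
        return int(op(p, q))
    key = (op, p, q)
    if key not in table:
        i = max(dim(p), dim(q))
        (pl, ph), (ql, qh) = cofactors(p, i), cofactors(q, i)
        table[key] = mk(i, apply_op(op, pl, ql, table), apply_op(op, ph, qh, table))
    return table[key]


def and_exists(p, q, quantified, table):
    """OBDD for: exists {v_i : i in quantified}. (p AND q).

    `table` must not be reused with a different `quantified` set.
    """
    if p in (0, 1) and q in (0, 1):
        return p & q
    key = ("and_exists", p, q)
    if key not in table:
        i = max(dim(p), dim(q))
        (pl, ph), (ql, qh) = cofactors(p, i), cofactors(q, i)
        l = and_exists(pl, ql, quantified, table)
        h = and_exists(ph, qh, quantified, table)
        if i in quantified:
            r = apply_op(operator.or_, l, h, table)   # quantify v_i out: l OR h
        else:
            r = mk(i, l, h)                           # keep v_i, as in Apply
        table[key] = r
    return table[key]


# One-variable system from section 2.3.3: b is v_2, b' is v_1, R = b OR b'.
T = {}
b, bp = (2, 0, 1), (1, 0, 1)
not_bp = (1, 1, 0)
R = apply_op(operator.or_, b, bp, T)

ex_not_b = and_exists(R, not_bp, {1}, T)    # EX (NOT b) = exists b'. (R AND NOT b')
ex_b = and_exists(R, bp, {1}, T)            # EX b       = exists b'. (R AND b')
```

On this example `ex_not_b` is the OBDD for $b$ and `ex_b` is the terminal 1, matching the hand calculations of section 2.3.3. The OBDD for the full conjunction is never constructed.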

As in the Apply algorithm, a table of triples $(p, q, r)$ is used to avoid
re-solving previously computed subproblems. The maximum size of this
table is $|p| \cdot |q|$. However, unlike in the Apply algorithm, the recursive
calls cannot be executed in constant time. This is because each call
may require a $\vee$ operation to be performed. At present, the author
is unaware of a bound on the complexity of AndExists better than
$O(|p| \cdot |q| \cdot 2^{2n})$, which is simply the number of $\vee$ problems to be solved
($|p| \cdot |q|$ in the worst case) times the square of the largest possible OBDD
size, $2^n$. In practice, this number of operations has not been observed,
so one might conjecture that there is a tighter bound. It seems unlikely
that a polynomial bound will be found, however, since it is easily shown
that if vector existential quantification on OBDDs can be computed in
polynomial time, then P = NP.

Figure 2.3: Variable ordering for 3-SAT reduction

The proof of this is by reduction from 3-SAT, as follows: Let $f =
t_1 \wedge t_2 \wedge \cdots \wedge t_k$ be a 3-SAT formula, that is, $t_i = (x_i \vee y_i \vee z_i)$, where $x_i$,
$y_i$ and $z_i$ are positive or negative literals. The OBDD representation of
each $t_i$ has no more than 3 non-terminals. Now introduce new variables
$V' = (v'_1, v'_2, \ldots, v'_k)$, corresponding to the terms of $f$, and let

$$
f' = \bigvee_{1 \le i \le k} \Bigl( \neg t_i \wedge v'_i \wedge \bigwedge_{1 \le j < i} \neg v'_j \Bigr)
$$

For a suitable variable ordering, the OBDD representing $f'$ has no more
than $4k$ non-terminals (see figure 2.3), hence can be built in polynomial
time. The formula $f$ is satisfiable iff $\exists V'.\,f' \neq 1$. Thus, if $\exists V'.\,f'$ can be
computed in polynomial time, then P = NP.
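The logical core of the reduction can be checked by brute force on small instances. The sketch below assumes that $f'$ has the form "some term $t_i$ is false, $v'_i$ is true, and every $v'_j$ with $j < i$ is false"; under that assumption $\exists V'.\,f'$ is equivalent to $\neg f$. The clause encoding (signed integers for literals) and all function names are hypothetical, and the enumeration is exponential, so this illustrates only the equivalence and says nothing about OBDD sizes.

```python
from itertools import product


# A clause is a tuple of literals; literal +n means x_n, literal -n means NOT x_n.
def clause_true(clause, x):
    return any(x[abs(lit)] == (lit > 0) for lit in clause)


def f(clauses, x):
    """The 3-SAT formula: the conjunction of all clauses."""
    return all(clause_true(t, x) for t in clauses)


def f_prime(clauses, x, vp):
    """Assumed form: OR over i of (NOT t_i AND v'_i AND no v'_j set for j < i)."""
    return any(not clause_true(t, x) and vp[i] and not any(vp[:i])
               for i, t in enumerate(clauses))


def exists_vp(clauses, x):
    """(exists V'. f') evaluated at the assignment x."""
    k = len(clauses)
    return any(f_prime(clauses, x, vp) for vp in product([False, True], repeat=k))


def assignments(n):
    for bits in product([False, True], repeat=n):
        yield dict(zip(range(1, n + 1), bits))


def satisfiable_via_quantification(clauses, n):
    """f is satisfiable iff (exists V'. f') is not the constant 1."""
    return not all(exists_vp(clauses, x) for x in assignments(n))


sat_example = [(1, 2, 3), (-1, 2, -3)]
# All eight sign patterns over three variables: every assignment falsifies one clause.
unsat_example = [(s1 * 1, s2 * 2, s3 * 3) for s1, s2, s3 in product([1, -1], repeat=3)]

sat_result = satisfiable_via_quantification(sat_example, 3)
unsat_result = satisfiable_via_quantification(unsat_example, 3)
```

For these two instances `sat_result` is true and `unsat_result` is false.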

As an aside, it is not difficult (though a bit tedious) to show that the
symbolic CTL model checking problem is PSPACE-complete. To show
PSPACE-hardness, one starts with a polynomial space bounded Turing
machine, introduces a sufficient number of Boolean variables to encode
the entire tape, plus the pointer and the finite control, then expresses
the transition relation of the entire system as a QBF formula. To show
that the problem is in PSPACE, one can show that the problem can
be reduced to satisfiability of a QBF formula of polynomial size, using
the "iterative squaring" technique of Burch, et al. [BCM+90]. Details
are left to the reader.

2.4 Examples

Although the worst case complexity of symbolic model checking is high
(using OBDDs or other Boolean function representations), in practice
the worst case complexity is rarely achieved, and the symbolic technique
can in some cases be dramatically more efficient than previous methods.
As an illustration of this, let's look at two hardware examples: a
synchronous fair bus arbiter, and an asynchronous distributed mutual
exclusion ring circuit (the one studied by David Dill in his thesis [Dil88]
and designed by Alain Martin [Mar85]).

2.4.1  Synchronous  state  machines 

For a synchronous finite state machine, the transition relation can be
given as a conjunction of Boolean formulas, each determining the new
state of one register as a function of its old state and the inputs. Let
$V = \{v_1, v_2, \ldots, v_n\}$ be a set of Boolean variables representing the state
of the registers in the circuit, and let $W = \{w_1, w_2, \ldots, w_m\}$ be a set
of variables representing the values of the inputs to the circuit. For all
$i = 1 \ldots n$, let $f_i[V, W]$ define the value of register $i$ in the next state,
in terms of $V$ and $W$. The transition relation of the state machine can
be expressed as a Boolean formula in the following form:

$$R = \bigwedge_{i=1}^{n} R_i, \quad \text{where } R_i = (v_i' \Leftrightarrow f_i[V, W]). \tag{2.19}$$

In general, for models of synchronous systems, the transition relation
is a conjunction of formulas representing the individual components
of the system, since transitions of the components are simultaneous.
The outputs of the state machine can be given as Boolean functions of
the inputs and registers. These functions can be substituted for atomic
propositions in CTL formulas, so there is no need to introduce variables
to represent the outputs.
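To make the form of (2.19) concrete, the following sketch builds the transition relation of a small machine invented for this illustration: a two-bit counter with a single enable input. The relation is represented as a Python predicate rather than as an OBDD, and the names `NEXT`, `R`, `bits` and `successors` are hypothetical.

```python
from itertools import product

# Hypothetical example machine: a 2-bit counter with an enable input w_1.
# v[0] is the low bit (v_1), v[1] the high bit (v_2), w[0] the input w_1.
NEXT = [
    lambda v, w: v[0] != w[0],               # f_1: low bit toggles when enabled
    lambda v, w: v[1] != (v[0] and w[0]),    # f_2: high bit toggles on carry
]


def R(v, w, vp):
    """Conjunction over i of (v_i' <=> f_i[V, W]), following (2.19)."""
    return all(vp[i] == bool(f_i(v, w)) for i, f_i in enumerate(NEXT))


def bits(n):
    return list(product([False, True], repeat=n))


# Leaving the inputs unconstrained (existentially quantified), collect the
# possible next states of every state.
successors = {
    v: {vp for vp in bits(2) for w in bits(1) if R(v, w, vp)}
    for v in bits(2)
}
```

Each conjunct constrains exactly one next-state variable, so for a fixed state and input exactly one next state satisfies `R`. The nondeterminism visible in `successors` comes only from the free input.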

As an example of a synchronous state machine, we will consider
a synchronous bus arbiter circuit. The purpose of the bus arbiter is
to grant access on each clock cycle to a single client among a number
of clients contending for the use of a bus (or other resource). The
inputs to the circuit are a set of request signals $req_0 \ldots req_{k-1}$, and the
outputs are a set of acknowledge signals $ack_0 \ldots ack_{k-1}$. Normally, the
arbiter asserts the acknowledge signal of the requesting client with the
lowest index. However, as requests become more frequent, the arbiter is
designed to fall back on a round robin scheme, so that every requester is
eventually acknowledged. This is done by circulating a token in a ring
of arbiter cells, with one cell per client. The token moves once every
clock cycle. If a given client's request persists for the time it takes for
the token to make a complete circuit, that client is granted immediate
access to the bus.

Figure 2.4: Cell of synchronous arbiter circuit

The basic cell of the arbiter is depicted in figure 2.4. This cell
is repeated $k$ times, as shown in figure 2.5. Each cell has a request
input and an acknowledge output. The grant output of cell $i$ is passed
to cell $i + 1$, and indicates that no clients of index less than or equal
to $i$ are requesting. Hence, a cell may assert its acknowledge output
if its grant input is asserted. Each cell has a register $T$ which stores
a one when the token is present. The $T$ registers form a circular shift
register which shifts up one place each clock cycle. Each cell also has a
register $W$ (for "waiting") which is set to one when the request input
is asserted and the token is present. The register remains set while the
request persists, until the token returns. At this time, the cell's override
and acknowledge outputs are asserted. The override signal propagates
through the cells below, negating the grant input of cell 0, and thus
preventing any other cells from acknowledging at the same time. The
circuit is initialized so that all of the $W$ registers are reset and exactly
one $T$ register is set.

Figure 2.5: Configuration of the synchronous arbiter circuit

The desired properties of the arbiter circuit are:

1. No two acknowledge outputs are asserted simultaneously

2. Every persistent request is eventually acknowledged

3. Acknowledge is not asserted without request

Expressed in CTL, they are:

1. $\bigwedge_{i \neq j} \mathrm{AG}\,\neg(ack_i \wedge ack_j)$

2. $\bigwedge_i \mathrm{AG}\,\mathrm{AF}\,(req_i \Rightarrow ack_i)$

3. $\bigwedge_i \mathrm{AG}\,(ack_i \Rightarrow req_i)$

Using the symbolic CTL model checking procedure, we can determine
whether the design has these properties, for a given number of
cells. Figure 2.6 plots the performance of the symbolic model checking
procedure for this example in terms of several measures: the size
of the transition relation in OBDD nodes, the total run time (on a
Sun3, running an implementation in the C language), and the maximum
number of OBDD nodes used at any given time.* We observe
that as the number of cells in the circuit increases, the size of the
transition relation increases linearly (in section 2.5, we will prove a theorem
that shows why this is the case). The execution time is well fit by a
quadratic curve. The number of reachable states, however, explodes
exponentially (note the logarithmic scale on the reachable states axis).

* The latter number should be regarded as being accurate only to within a factor
of two, since the garbage collector in the implementation scavenges for unreferenced
nodes only when the number of nodes doubles.

To obtain polynomial performance for this example, it was necessary
to add a wrinkle to the symbolic model checking algorithm. In the first
experiment it was found that although most of the specification was
checked quickly, the time required to check property 2 for cell 0 doubled
each time a cell was added. The reason for this is rather remarkable.
Consider a function called Rotate, which returns true for a pair of $n$
bit binary numbers when one number can be obtained from the other
by a rotation of $j$ bits. There is no variable ordering which yields an
efficient OBDD for this function for all $j$.* In fact, a very similar
function occurs in computing the set of states satisfying the formula
$\mathrm{AF}\,(req_0 \Rightarrow ack_0)$, where the two binary numbers are given by the
$W$ and $T$ registers respectively. Note that, for cell 0, request implies
acknowledge exactly when no other cell has both $W$ and $T$ registers
set. The $T$ registers rotate once per clock cycle. Thus, $req_0 \Rightarrow ack_0$ is
necessarily true $j$ steps in the future exactly when there is no other cell
$i$ for which $W_i \wedge T_{(i-j) \bmod k}$. The OBDD representing this set of states
grows exponentially in the number of cells.

* This can be shown using the technique of [Bry91]. It is sufficient that for
any variable order there is some rotation such that when the order is cut in half,
information proportional to $n$ must be passed from one half to the other.

This illustrates a fairly general phenomenon: circuits tend to be
"well behaved" in the part of their state space which is reachable from
the initial state, but not elsewhere. In the case of the synchronous
arbiter, only states with one $T$ register set are reachable. However,
the symbolic model checking technique considers all states, including
states with multiple tokens. A good solution to this problem in general
is first to compute the set of reachable states, and then to restrict all
of the computations of the CTL model checking algorithm to those states.
Since the reachable states are closed under the transition relation, this
has no effect on the truth value obtained for formulas at the initial
state. In particular, this solves the problem of the bus arbiter circuit,
since in its reachable state space, the $T$ registers cannot represent an
arbitrary binary number.

The set of reachable states is the least fixed point of

$$\tau[Y] = I \vee R(Y)$$

where $I$ is the set of initial states. Applying the standard fixed point
algorithm in this case effectively yields a forward breadth first search of
the state space. By computing the reachable states first and then using
this set to restrict the CTL model checking algorithm, we obtain the
polynomial run time results described above. This technique is also
used for other experiments described in the sequel, unless otherwise
noted.
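The reachability computation can be illustrated on a drastically simplified version of the arbiter that keeps only the $T$ registers. In this sketch, which is an assumption-laden toy and not the arbiter model used in the experiments, a state is a tuple of $K$ token bits, sets of states are explicit Python sets rather than OBDDs, and the names `step`, `image` and `reachable` are hypothetical.

```python
from itertools import product

K = 4  # number of cells (illustrative choice)


def step(t):
    """Circular shift: the token held by cell i moves to cell i + 1 (mod K)."""
    return tuple(t[(i - 1) % K] for i in range(K))


def image(Y):
    """R(Y): the successors of the states in Y."""
    return {step(s) for s in Y}


def reachable(init):
    """Least fixed point of tau[Y] = I OR R(Y), by the standard iteration."""
    Y = set()
    while True:
        nxt = set(init) | image(Y)
        if nxt == Y:
            return Y
        Y = nxt


I = {tuple(i == 0 for i in range(K))}       # exactly one T register is set
reach = reachable(I)
all_states = set(product([False, True], repeat=K))
```

Only the $K$ states with a single token are reachable, out of $2^K$ assignments to the $T$ registers, and the reachable set is closed under the transition relation.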

2.4.2  Asynchronous  state  machines 

In  an  asynchronous  state  machine,  there  is  no  global  clock  to  which 
all state changes are synchronized. This makes designing correct
asynchronous circuits considerably more challenging than designing correct
synchronous circuits. We will consider two plausible models of
asynchronous state machines. In the first, which we will call the
simultaneous model, any or all state variables may change state in a given
transition. Each state component makes an independent and
non-deterministic choice regarding whether to change value or not. In the
second model, which we will call the interleaving model, only one state
component changes value in a given transition. The choice of which
component changes value is non-deterministic. In either model, we

A discussion of which state machine model is more suitable for circuit design
is beyond the scope of this work. In general, conditions would have to be imposed
on either model in order to make it implementable in a given design style. For
discussion of asynchronous design techniques, see [MB59, Sei80b].
consider an asynchronous state machine composed of n gates. We will
use state variable $v_i$ to stand for the output of gate $i$, and $f_i[V, W]$ to
represent the function computed by gate $i$ (where $V$ is the set of state
variables, and $W$ the set of inputs).

In the simultaneous model, the transition relation can be represented
by a formula in the form:

$$R = \bigwedge_i R_i, \quad \text{where } R_i = (v_i' \Leftrightarrow f_i[V, W]) \lor (v_i' \Leftrightarrow v_i). \qquad (2.20)$$

For any transition and any state variable $v_i$, either the new value of
$v_i$ is determined by $f_i[V, W]$, or it is the same as the old value. Note
that this differs from the synchronous model (2.19) in which every state
variable is reevaluated at every transition.

In the interleaving model, the transition relation can be represented
by a formula in the form:

$$R = \bigvee_i R_i, \quad \text{where } R_i = (v_i' \Leftrightarrow f_i[V, W]) \land \bigwedge_{j \neq i} (v_j' \Leftrightarrow v_j). \qquad (2.21)$$

In any transition, for some state variable $v_i$, the new value of $v_i$ is
determined by $f_i[V, W]$, and the remaining variables keep their old
value. Note that in this case, the transition relation is represented
by a disjunction of component relations rather than a conjunction.
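
The two relations can be compared on a small explicit example. The sketch below is an illustration in Python with states as tuples of bits and the inputs W omitted; the function names and the two-inverter ring are assumptions made for the example and do not come from the original text.

```python
from itertools import product


def simultaneous_successors(gates, state):
    """Successors under (2.20): each v_i takes f_i(state) or keeps its old value."""
    choices = [{f(state), old} for f, old in zip(gates, state)]
    return set(product(*choices))


def interleaving_successors(gates, state):
    """Successors under (2.21): one v_i is reevaluated, all others are unchanged."""
    return {state[:i] + (f(state),) + state[i + 1:] for i, f in enumerate(gates)}


# Hypothetical example: two inverters connected in a ring.
inverters = [lambda s: 1 - s[1], lambda s: 1 - s[0]]
sim = simultaneous_successors(inverters, (0, 0))
inter = interleaving_successors(inverters, (0, 0))
```

From state (0, 0) the simultaneous relation allows all four combinations, while the interleaving relation allows only the two states in which a single inverter has fired.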

Figure 2.7: One cell of the DME circuit

In general, for models of parallel processes whose actions interleave
arbitrarily, the transition relation is disjunctive. If this is the case, we
can make an easy optimization in the symbolic model checking technique:
we observe that the set of states reachable by one step of the
system is the union of the sets of states reachable by one step of each
individual component. This is reflected in the fact that existential
quantification distributes over disjunction. Thus:

$$EX\,p = \exists V'.\ \Big(\big(\bigvee_i R_i\big) \land p(V \leftarrow V')\Big) = \bigvee_i \exists V'.\ \big(R_i \land p(V \leftarrow V')\big)$$

Using this equality, we can avoid computing the transition relation of
the system and instead use only the transition relations of the
individual processes. This technique is called early quantification: by
rearranging the computations, we apply quantification before the
logical disjunction operation. Heuristically, quantification tends to reduce
OBDD size, since it reduces the number of variables. Hence, the size
of the intermediate results is usually reduced (though the final result is
the same).
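
The equality behind early quantification can be checked on explicit relations. In this Python sketch a relation is a set of (state, next state) pairs, so the existential quantification over V' becomes a search for a successor in p; `ex`, `ex_early` and the sample relations are assumptions introduced for illustration. With explicit sets there is no size advantage; the sketch only shows that the two orders of computation agree.

```python
def ex(relation, p):
    """EX p for a relation given as a set of (s, s') pairs."""
    return {s for (s, t) in relation if t in p}


def ex_early(component_relations, p):
    """Quantify against each R_i separately, then take the union of the results."""
    result = set()
    for r_i in component_relations:
        result |= ex(r_i, p)
    return result


# Hypothetical component relations R_1 and R_2 over states 0..2.
r1 = {(0, 1), (1, 2)}
r2 = {(2, 0), (0, 2)}
p = {2}
monolithic = ex(r1 | r2, p)      # builds the full relation first
early = ex_early([r1, r2], p)    # never builds r1 | r2
```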

Our example of an asynchronous state machine is the distributed
mutual exclusion (DME) circuit of Alain Martin [Mar85]. It is a
speed-independent circuit [Sei80b] and makes use of special two-way mutual
exclusion circuits as components. Figure 2.7 is a diagram of a single cell
of the distributed mutual-exclusion ring. The circuit works by passing
a token around the ring, via the request and acknowledge signals lr
and la on the left and rr and ra on the right. A user of the DME gains
exclusive access to the resource via the request and acknowledge signals
ur and ua.

The  specifications  of  the  DME  circuit  are  as  follows: 

1. No two users are acknowledged simultaneously.

2. An acknowledgment is not output without a request.

3. An acknowledgment is not removed while a request persists.

4. All requests are eventually acknowledged.

The AndExists algorithm of section 2.15.1, which combines conjunction and
quantification in a bottom-up manner, is also an example of early quantification.

We will consider only the first specification, regarding mutual exclusion.
The others are easily formulated in CTL, although the last requires the
use of fairness constraints (see section 2.6.1) to guarantee that all gate
delays are finite. The formalization of the mutual exclusion
specification is

$$\bigwedge_{i \neq j} AG\, \neg(ua_i \land ua_j)$$

Now let's look at the performance of the symbolic model checking
algorithm in checking this formula, for both a simultaneous and an
interleaving model of the circuit. For the interleaving model, we use
the early quantification technique. Figure 2.8 plots the relative
performance for the simultaneous model (method 1) and the interleaving
model (method 2). Part (a) shows the run time as a function of the
number of DME cells, part (b) shows the total storage used (measured
in OBDD nodes) and part (c) shows the number of nodes used to
represent the transition relation. For the moment, disregard the curves for
method 3. The experiment was run for up to 7 cells of the simultaneous
model (limited by space) and up to 10 cells of the interleaving model
(limited by time). Part (b) of the figure shows the substantial space
advantage of the interleaving model, and from part (c), we can see that
most of the difference is accounted for by the savings in representing the
transition relation using early quantification. In both cases, the space
used is linear in the number of cells. However, we note that the increase
in run time appears to be cubic for the simultaneous model, but quartic
for the interleaving model. It would seem that if enough storage were
available to continue the curve for method 1, the two curves would meet
in the neighborhood of 10 cells.

The different asymptotic performance for the simultaneous and
interleaving models can be understood by looking at the OBDDs that
occur in the fixed point iterations computing the reachable states.
Figure 2.9 plots the size of the largest such OBDD for each method. We
can see clearly that the size is increasing linearly for the
simultaneous model, but quadratically for the interleaving model. This is a
phenomenon which occurs generally when comparing simultaneous vs.
interleaving models. It can be understood by considering a very simple
system composed of n processes, each with states 0 and 1, and each
alternating non-deterministically between these two states. If we start
the system with all processes in state 0, what do we observe after k
steps? In the simultaneous case, after one step, all possible states are
reachable. In the interleaving case, however, after k steps, all global
states with at most k 1's are reachable. This is a symmetric function.
As Bryant noted [Bry86], all symmetric functions can be represented by
quadratic size OBDDs. The symmetry results from the fact that in
an interleaving model, exactly one state component changes in a given
transition, and the choice is arbitrary. In general, after k steps of such
a model, the number of steps taken by each state component sums to k.
Hence, in the set of states reachable after k steps, there is an induced
correlation between the states of otherwise independent processes.
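
The claim about the toggling processes is easy to check by brute force. The following Python sketch enumerates the states reachable within k interleaved steps from the all-zero state; the function names and the choice n = 4 are assumptions of this example.

```python
from itertools import product


def step_interleaving(states):
    """All states obtained by toggling exactly one process of some state in `states`."""
    return {s[:i] + (1 - s[i],) + s[i + 1:] for s in states for i in range(len(s))}


def reachable_within(n, k):
    """States reachable from all-zeros in at most k interleaved steps."""
    reached = {(0,) * n}
    for _ in range(k):
        reached |= step_interleaving(reached)
    return reached


n = 4
# For each k, the states with at most k ones, computed independently.
expected = [{s for s in product((0, 1), repeat=n) if sum(s) <= k} for k in range(n + 1)]
observed = [reachable_within(n, k) for k in range(n + 1)]
```

Membership in each of these sets depends only on the number of 1's in the state, which is what makes the characteristic function symmetric.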

The simultaneous model appears to be inferior to the interleaving
model from a symbolic model checking point of view, owing to the large
amount of space required to represent the transition relation. Most
of this, however, can be attributed to a phenomenon we observed in
the previous example: systems tend to be well behaved only in their
reachable state space. In the symbolic model checking technique, we
represent the transition relation over the entire state space. Although
representing only the reachable transitions might be more efficient, we
seem to be caught in a Catch-22: we need to represent the transition
relation to compute the set of reachable states. We can avoid this
problem by incrementally computing only as much of the transition
relation as is necessary to compute the next iteration of the fixed point
algorithm. Recall that the reachable state set is the least fixed point of
$\tau[Y] = I \lor R(Y)$. By rearranging the fixed point computation slightly,
we only need represent $R$ correctly for those transitions $(x, y)$, where $x$
is on the "frontier" of the search:

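$$\begin{aligned}
\tau^{i+1}(\mathrm{false}) &= \tau^{i}(\mathrm{false}) \lor R(\tau^{i}(\mathrm{false})) \\
&= \tau^{i}(\mathrm{false}) \lor R(\tau^{i}(\mathrm{false}) - \tau^{i-1}(\mathrm{false}))
\end{aligned}$$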

At each iteration, we can reevaluate the formula $R$ over the set of states
$\tau^{i}(\mathrm{false}) - \tau^{i-1}(\mathrm{false})$. This can be done by restricting each subformula
using either the logical and or using the Restrict operator of Coudert,
Madre and Berthet (see section 2.8). This results in a sequence
of approximations to the transition relation which are substantially
more compact than the complete transition relation, although we must
reevaluate $R$ at each iteration, rather than evaluating it once at the
beginning. We will call this method 3.
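
The rearranged iteration can be sketched as follows. This Python fragment uses explicit sets and shows only the frontier rearrangement, in which the image is taken over the newly added states; it does not model the simplification of the OBDD for R, which is where the space saving reported here comes from. The name `reachable_frontier` and the example successor function are assumptions.

```python
def reachable_frontier(initial, successors):
    """Reachable set where each image is taken over the frontier only."""
    reached = set(initial)
    frontier = set(initial)          # tau^i(false) - tau^(i-1)(false)
    while frontier:
        image = {t for s in frontier for t in successors(s)}
        frontier = image - reached   # states seen for the first time
        reached |= frontier
    return reached


# Hypothetical system: a counter that adds 2 modulo 6.
evens = reachable_frontier({0}, lambda s: [(s + 2) % 6])
```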

In part (a) of figure 2.8, we see that the time used by this method,
while  still  cubic,  is  a  substantial  improvement  over  the  previous  method 
for  the  simultaneous  model  (method  1).  More  importantly,  the  space 
used  is  dramatically  improved,  allowing  a  model  with  a  larger  number 
of  cells  to  be  checked.  The  method  overtakes  the  interleaving  model  in 
run  time  at  about  8  cells,  owing  to  its  better  asymptotic  performance. 

Figure 2.10 plots the number of reachable states as a function of the
number  of  cells  (the  numbers  are  indistinguishable  for  the  two  models). 
The  number  of  reachable  states  grows  exponentially  in  the  number 
of  cells,  though  not  as  rapidly  as  the  total  number  of  states,  which 
is  2*^”.  The  key  point  is  that  for  all  three  methods,  the  space  and 
time  necessary  for  the  symbolic  model  checking  method  is  polynomial 
in the number of cells. Thus, the state explosion problem has been
avoided. The overall time complexity of $O(n^3)$ for the simultaneous
model derives from three factors: a linear increase in the transition
relation OBDD, a linear increase in the state set OBDDs obtained
in the fixed point iterations, and a linear increase in the number of
iterations. For the interleaving model, the quadratic increase in the
state set OBDDs results in an overall $O(n^4)$ time complexity. On the
other hand, the number of reachable states increases by roughly a factor
of ten with each added cell.

It is not immediately clear that either the interleaving or
simultaneous model is preferable in general. Interleaving models seem to be
better when the number of asynchronous processes is small, and
simultaneous when the number is large. The cache consistency protocol of
chapter 4 is an example of a large system with a fairly small number of
complex asynchronous processes. This is an appropriate application of
an interleaving model.

The polynomial performance of the symbolic model checking
algorithm, in spite of the exponential increase in states, makes it possible to
analyze fairly large instantiations of the two example circuits (the
synchronous arbiter and the DME circuit). It should be possible to verify
these and similar circuits for any reasonable fixed number of cells. This
begs the question: how many cells do we need to analyze to be
guaranteed that the design is correct for any number of cells? Intuitively,
for sufficiently large n, a sequence of n + 1 cells should be equivalent
in some sense to a sequence of n cells. But in what sense equivalent?
This problem is dealt with in chapter 5, where we consider induction
over processes.


2.5  Graph  width  and  OBDDs 

In this section, we consider the asymptotic growth of OBDDs
representing certain topological classes of circuits. This analysis explains
some of the performance results of the previous section.

In 1989, Berman proved a bound on the OBDD size needed to
represent circuits of bounded width. A circuit has bounded width if its
elements can be arranged in a linear order such that any cut through
the order crosses at most a bounded number of wires $w$, called the width
of the circuit. There exists a variable ordering such that the OBDD
size is bounded by $n 2^{w}$, where $n$ is the number of primary inputs of the
circuit. This result applies only if the order is "topological", meaning
essentially that the direction of all the wires follows the ordering. Here,
this result is generalized, to show that if $w_f$ bounds the number of wires
through any cut in the forward direction, and $w_r$ bounds the number
in the reverse direction, then the OBDD size is bounded by $n 2^{w_f 2^{w_r}}$.

For the case where $w_r = 0$, this is the same as Berman's result. Using
this result, we can linearly bound the OBDD representation for the
transition relation of circuits like the arbiter and the DME ring, which
have linear arrangements with a bounded number of wires through any
cross section.

Fujita states that tree circuits using only AND, OR and XOR gates
have linearly bounded OBDD representations [FMK90]. Here, we show
that a more general class of circuits with bounded "tree width" and
arbitrary function elements have polynomially bounded OBDDs. The
essence of the argument is to show that these circuits can be arranged
in a linear order with a width that is logarithmic in the number of
gates. This yields a bound on the OBDD size which is polynomial in
the number of gates.
2.5.1  Bounded  width  circuits

Let $L = (G, <)$ be a linear order on the gates of a circuit. We classify
the primary inputs and outputs of the circuit as special instances of
gates in order to simplify the definitions, and assume that the primary
output is at the top of the order. Given an order $L$, we will say that
the forward cross section of the circuit at gate $g$ is the set of wires
connected to an output of some gate $g_1$ and an input of some gate $g_2$
such that $g_1 < g$ and $g < g_2$. The reverse cross section is the set of
wires connected to an output of some gate $g_1$ and an input of some gate
$g_2$ such that $g_2 < g$ and $g < g_1$. We assume that no wire is connected
to the outputs of two distinct gates, so these two sets are disjoint. We
also assume that there are no cycles in the circuit, to ensure that the
circuit computes a function. The order $L$ is said to be topological when
all of the reverse cross sections are empty.

The forward width of the circuit under order $L$, denoted $w_f$, is the
maximum size of the forward cross section at any gate $g$. Similarly, the
reverse width of the circuit under order $L$, denoted $w_r$, is the maximum
size of the reverse cross section at any gate $g$.

The cross section of an OBDD at level $i$ is the set of nodes labeled
with variable $v_i$. Note that in this section, we will number the variables
of the OBDD from the top down, since this makes the proofs simpler.
The width $w_p$ of an OBDD $p$ is the maximum size of any cross section
of $p$. The size of an OBDD is the sum of the sizes of its cross sections.
Thus, the OBDD size is bounded by $n \cdot w_p$, where $n$ is the number of
variables.

It is easily shown that the size of the cross section of an OBDD at
level $i$ is the number of distinct functions

$$f_x(v_i, \ldots, v_n) = f_p(x_1, \ldots, x_{i-1}, v_i, \ldots, v_n)$$

which depend on $v_i$, where $x = (x_1, \ldots, x_{i-1})$ is a Boolean vector and $f_p$
is the function represented by $p$. This observation leads to the following
theorem bounding the size of an OBDD in terms of the forward and
reverse widths of the circuit it represents:

Theorem 7 If a circuit computing function $f$ has forward width $w_f$
and reverse width $w_r$, for some linear order $L$, then there is an OBDD
$p$ representing function $f$ of size bounded by $n 2^{w_f 2^{w_r}}$, where $n$ is the
number of inputs of the circuit.

Figure 2.11: Proof of bounded width theorem

Proof. Associate the variables $v_1, \ldots, v_n$ of the OBDD with the
inputs of the circuit, such that for all $i < j$, $v_i < v_j$. We can bound
the size of the $i$th cross section of the resulting OBDD as follows. Let
$x = (x_1, \ldots, x_{i-1})$ be a Boolean vector. Split the circuit in half by
choosing any gate $g$ such that $v_{i-1} < g < v_i$, letting $Y$ be the forward
cross section at $g$ and $Z$ the reverse cross section. This situation is
depicted in figure 2.11. For any given value of $x$, $Y$ is a function of
$Z$, and this function determines $f_x(v_i, \ldots, v_n)$. The number of Boolean
functions with $|Z|$ inputs and $|Y|$ outputs is $2^{|Y| 2^{|Z|}}$ (to see this, count
the number of entries in the truth table). This bounds the total number
of distinct functions $f_x$, which in turn bounds the width of the OBDD
representing $f$ at level $i$. We know that $|Y| \le w_f$ and $|Z| \le w_r$. Thus,
the overall OBDD size is bounded by $n \cdot 2^{w_f 2^{w_r}}$. □
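
The characterization of cross sections used in the proof can be evaluated directly for small functions. The Python sketch below counts, for each level, the distinct subfunctions $f_x$ that depend on $v_i$; it works on truth tables rather than on an OBDD package, and the function name and the two sample functions are assumptions chosen for illustration.

```python
from itertools import product


def cross_section_sizes(f, n):
    """Size of each level of the reduced OBDD of f under the order v_1 < ... < v_n.

    Level i holds one node per distinct subfunction f_x, obtained by fixing
    the first i-1 variables to x, that actually depends on v_i.
    """
    sizes = []
    for i in range(n):
        rest = list(product((0, 1), repeat=n - i))    # assignments to v_i .. v_n
        seen = set()
        for x in product((0, 1), repeat=i):
            table = tuple(f(*x, *r) for r in rest)    # truth table of f_x
            half = len(table) // 2                    # first half has v_i = 0
            if table[:half] != table[half:]:          # f_x depends on v_i
                seen.add(table)
        sizes.append(len(seen))
    return sizes


parity_sizes = cross_section_sizes(lambda a, b, c: a ^ b ^ c, 3)
and_sizes = cross_section_sizes(lambda a, b, c: a & b & c, 3)
```

Summing the list gives the number of non-terminal nodes, and its maximum is the width $w_p$ defined above.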


This bound is linear in the number of inputs, exponential in the
forward width and doubly exponential in the reverse width. The double
exponential appears to be necessary. This can be shown using the
"hidden weighted bit" function of [Bry91] as a counterexample. This
circuit can be ordered in such a way that between any two inputs there
is a cross section with $O(\log_2 n)$ wires in each direction, yet there is
an exponential lower bound on its OBDD size. If we could bound the
OBDD size with a single exponential in both the forward and reverse
widths, the OBDD size would be $O(n^k)$, where $k$ is a
constant, which would contradict the exponential lower bound.

The theorem is concerned with a single output of a combinational
circuit, but it can also be applied to the transition relation of a
sequential circuit. To do this, we simply transform the sequential circuit into
a combinational circuit which computes the transition relation of the
sequential circuit. This is done by adding a pair of inputs $v_i$ and $v_i'$
to represent the old and new values of each state component. Since
the transition relation of the circuit is the conjunction of the transition
relations of its components, we can do this while increasing the width
of the circuit by only one wire in the forward direction as depicted in
figure 2.12. Thus, for bounded width sequential circuits (even with
wires in both directions), the size of the OBDD representing the
transition relation is linear in $n_i + n_s$, where $n_i$ is the number of inputs and
$n_s$ is the number of state components. The synchronous arbiter
circuit and the DME circuit of the previous section provide experimental
confirmation of this.

We have shown for a certain structural class of circuits that the
representation of the transition relation is linearly bounded in the size
of the circuit. We should note that in the symbolic model checking
algorithm, we also use OBDDs to represent the set of states labeled with
a given CTL formula. Unfortunately, we cannot expect to polynomially
bound the size of the OBDDs representing these sets based purely on
structural considerations. The simplest example of this is probably a
circuit that inputs a binary number, stores one copy of it, then serially
rotates the original by an arbitrary number of bits. This circuit has the
simplest structure we might hope for that has any communication at
all between the components, yet there is no variable order which yields
a compact OBDD for the reachable state set of this circuit, since it
implements the rotate function. The same argument would apply to
a serial multiplier circuit. In general, if a circuit computes a function
serially which cannot be represented by a compact OBDD, then we
cannot expect the symbolic model checking algorithm using OBDDs to
be efficient.

Figure 2.12: Computing a conjunctive transition relation

2.5.2  Bounded  tree-width  circuits 

In the previous section, we considered the OBDD representation of
circuits whose gates can be arranged in a sequence with a bounded
number of wires in each cross section. Now we consider the slightly
more general class of circuits which can be arranged in a tree
with a bounded width property. This is not to say that the topology of
the circuit must be a tree: rather, it must be possible to lay a spanning
tree over the circuit in such a way that the width of the circuit across
any arc of the spanning tree is bounded. This notion of bounded
tree-width is defined as follows.

Let $T = (G, <)$ be a tree order over the gates of a circuit, where
$g' < g$ iff $g'$ is a descendant of $g$. Let $b$ be the branching degree of $T$ (i.e.,
the maximum number of children of any gate). As before, the forward
cross section at node $g$ is the set of wires connecting an output of $g_1$
and an input of $g_2$ such that $g_1 < g$ and $g < g_2$. Similarly, the reverse
cross section of $T$ at node $g$ is the set of wires connecting an output of
$g_1$ and an input of $g_2$ such that $g_2 < g$ and $g < g_1$. The forward width
of the tree $w_f$ is the size of the largest forward cross section, while the
reverse width $w_r$ is the size of the largest reverse cross section.

For the moment, let us consider the case $w_r = 0$, and let the width
$w$ stand for the forward width:

Lemma 2 For any topological tree order $T = (G, <)$, with width $w$ and
branching degree $b > 1$, there is a topological linear order $L = (G, <')$,
with width $w' \le w(b - 1)\log_2 |G|$.

Proof. By induction over $|G|$, the number of gates. The base
case, $|G| = 1$, is trivial. Assume the theorem holds for all circuits of
size less than $|G|$. Let $g$ be the root of the tree, and let $G_1, \ldots, G_k$
be the subtrees of the root, where $k \le b$, and $|G_1| \le \cdots \le |G_k|$. By
inductive hypothesis, there exist linear orders $L_i = (G_i, <_i)$ of width
$w_i \le w(b - 1)\log_2 |G_i|$, for all $1 \le i \le k$. Let $L = (G, <')$ be the
extension of these orders such that $G_k <' \cdots <' G_1 <' g$, as depicted in
figure 2.13. The width $w'$ of $L$ is bounded by $\max_{1 \le i \le k}(w_i + (k - i)w)$.
Therefore, for some $i$,

$$w' \le w_i + (k - i)w$$

In the case $k = i$, we have

$$w' \le w_k \le w(b - 1)\log_2 |G_k| \le w(b - 1)\log_2 |G|$$

Otherwise, $i < k$ and

$$\begin{aligned}
w' &\le w\big(k - i + (b - 1)\log_2 |G_i|\big) \\
   &\le w(b - 1)\log_2\big(2^{\frac{k-i}{b-1}} |G_i|\big)
\end{aligned}$$

Here, we note that $|G_i| \le \big(\sum_{i \le j \le k} |G_j|\big)/(k - i + 1) \le |G|/(k - i + 1)$, so

$$w' \le w(b - 1)\log_2\Big(\frac{2^{\frac{k-i}{b-1}}}{k - i + 1}\,|G|\Big)$$

We note that since $i \ge 1$ and $k \le b$, $k - i \le b - 1$. Therefore $2^{\frac{k-i}{b-1}} \le 2$.
Further, since $i < k$, $k - i + 1 \ge 2$. Thus $2^{\frac{k-i}{b-1}}/(k - i + 1) \le 1$. Therefore,

$$w' \le w(b - 1)\log_2 |G| \qquad \Box$$

Figure 2.13: Arrangement of bounded width tree


The lemma says that from any topological tree order of width
$w$, we can derive a linear order of width $w' \le w(b - 1)\log_2 |G|$. It
follows by the previous theorem that the OBDD size is bounded by
$n 2^{w'} \le n 2^{w(b-1)\log_2 |G|} = n |G|^{w(b-1)}$, where $n$ is the number of primary
inputs. This bound is polynomial in the size of the circuit for a fixed
width and branching factor.

Now we turn to the question of tree orders that are not topological
(i.e., bounded tree-width circuits with both forward and reverse wires).
In this case, a logarithmic bound on the width of the linear order $L$ is
not sufficient, because the OBDD size can be doubly exponential in the
number of reverse wires.

We can still obtain a polynomial bound in $n$, however, by converting
a tree ordered circuit with reverse wires into a functionally equivalent
tree ordered circuit with only forward wires:

Lemma 3 If $T = (G, <)$ is a tree order over a circuit computing
function $f$, with forward width $w_f$ and reverse width $w_r$, then there is a
circuit computing $f$ with topological tree order $T' = (G', <')$ of forward
width $w_f' \le w_f 2^{w_r}$.

Proof. Consider $H$, a subtree rooted at gate $h$, letting $Y$ be the
forward cross section at $h$, and $Z$ the reverse cross section at $h$. Let
$h_1, \ldots, h_k$ be the children of $h$, and let $Y_1, \ldots, Y_k$ and $Z_1, \ldots, Z_k$ be
their respective forward and reverse cross sections. This situation is
depicted in figure 2.14. Let the output functions computed by $h$ be

$$Y = f(Z, Y_1, \ldots, Y_k)$$

and for $1 \le i \le k$, let

$$Z_i = r_i(Z, Y_1, \ldots, Y_k)$$

$$Y_i = f_i(Z_i)$$

We show by induction over $|H|$ that there exists a tree circuit $H'$ of
forward width $w_f' \le w_f 2^{w_r}$ and reverse width $w_r' = 0$, computing the
functions

$$f_x = f(x, Y_1, \ldots, Y_k), \quad \text{for } x \in \{0, 1\}^{|Z|}$$

Note that $f_x$ is simply row $x$ in the truth table for $Y$. Since there are
$2^{|Z|}$ possible values of $x$, and $f_x$ has $|Y|$ components, the number of
outputs of $H'$ is $|Y| 2^{|Z|}$.

By inductive hypothesis, there exist circuits $H_i'$ for $1 \le i \le k$,
satisfying the width bound and computing the functions

$$f_{i,x} = f_i(x), \quad \text{for } x \in \{0, 1\}^{|Z_i|}$$

Now, let $h'$ be a gate computing $f_x$ according to the following system
of equations:

$$f_x = f(x, Y_1, \ldots, Y_k)$$

$$Y_i = f_{i,\, r_i(x, Y_1, \ldots, Y_k)} \quad \text{for } 1 \le i \le k$$
Let $H'$ be the tree ordered circuit obtained by taking $h'$ as the root,
and $H_1', \ldots, H_k'$ as the children of the root. The reverse width at the
root is 0, since $f_x$ does not depend on $Z$, and the forward width at
the root is $|Y| 2^{|Z|}$. Hence, using the inductive hypothesis, $w_r' = 0$ and
$w_f' \le w_f 2^{w_r}$. If $h$ is the root node of $G$, then $H'$ computes the same
function as $G$. □

This  gives  us  the  following  theorem,  bounding  the  OBDD  size  for 
tree  ordered  circuits  with  both  forward  and  reverse  wires: 

Theorem 8 If a circuit $G$ computing function $f$ has forward width
$w_f$ and reverse width $w_r$ for some tree order $T$ of branching degree
$b > 1$, then there is an OBDD representing function $f$ of size bounded
by $n |G|^{(b-1) w_f 2^{w_r}}$, where $n$ is the number of primary inputs of the circuit.

Proof. According to lemma 3, for any tree ordered circuit of
forward width $w_f$ and reverse width $w_r$, we can construct a topological
tree ordered circuit of width $w \le w_f 2^{w_r}$, which computes the same
function. By lemma 2, this circuit has a topological linear order $L$ of
width at most $w' \le w(b - 1)\log_2 |G|$. By theorem 7, there is an OBDD
for the circuit of size bounded by

$$n 2^{w'} \le n |G|^{(b-1) w_f 2^{w_r}} \qquad \Box$$
Hence, in the case of bounded tree width circuits (of a fixed
branching degree), we also find that the OBDD size can be bounded
polynomially in the size of the circuit. In this case, the exponent of $n$ is
related to both the width and the branching factor. Clearly, for this
bound to be of any practical interest, $w_f$ must be small, and $w_r$ must
be very small. Nonetheless, the theorem demonstrates a more general
topological class of circuits with asymptotically compact OBDDs than
was previously known.
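
The chain of inequalities behind this bound can be written out step by step. This is a restatement under the notation of lemmas 2 and 3, with $w$ the width of the intermediate topological tree order; the numeric instance in the final comment is only an illustration with assumed values.

```latex
\begin{align*}
n\,2^{w'} &\le n\,2^{w(b-1)\log_2 |G|}      && \text{(lemma 2)} \\
          &=   n\,|G|^{w(b-1)}              \\
          &\le n\,|G|^{(b-1)\,w_f\,2^{w_r}} && \text{(lemma 3: } w \le w_f 2^{w_r}\text{)}
\end{align*}
% Illustration: b = 2, w_f = 3, w_r = 1 gives exponent (2-1) * 3 * 2^1 = 6,
% that is, a bound of n |G|^6. Raising w_r to 2 doubles the exponent to 12.
```

The illustration shows why the reverse width matters most: each additional reverse wire doubles the exponent, while each additional forward wire only adds to it.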


2.6  Mu-Calculus  model  checking

The Mu-Calculus [Par74] is a logic based on extremal fixed points that
is strictly more expressive than CTL, and can also express a variety
of properties of transition systems, such as reachable state sets, state
equivalence relations, and language containment between automata. A
symbolic model checking algorithm for this logic allows all of these
properties to be computed using OBDDs [BCM+90].

The Mu-Calculus augments the ordinary predicate calculus in two
ways. First, it allows terms to stand for relations. If $f$ is a formula
in which variables $x$ and $y$ are free, then $f$ characterizes a relation:
the set of all pairs $(x, y)$ satisfying $f$. This relation is denoted in the
Mu-Calculus by the term $\lambda x, y.\ f$. Second, the Mu-Calculus allows us
to express least and greatest fixed points. If $\tau$ is a term, and $Y$ is a
relational (predicate) symbol, then $\tau$ is said to be formally monotonic
in $Y$ if $Y$ always occurs under an even number of negations in $\tau$. In
this case, $\tau$ has least and greatest fixed points with respect to $Y$, which
are denoted $\mu Y.\ \tau$ and $\nu Y.\ \tau$. A fixed point of $\tau$ with respect to $Y$ is a
relation which yields itself when substituted for all free occurrences of
$Y$ in $\tau$.

■  'binersoii  and  I.ei  [hLSfi]  j^avf  .i  iiiodel  rlieckiiig  .algorithm  for  .a  somewh.at  ditfer- 
•  tit  version  of  flif  .\Iu-(  'alculus.  .atul  showed  that  iIutc  are  formulas  m  this  l(3i):u’  ih.at 
'  .iimot  fie  ex[)re.ssei|  iii(  Ih.  IIiie.  we  use  I  he  rel.atioiial  .\l  u-( '.d<'u  1  US  of  P.ar  k  [  I ’if  7  ll 
A structure in the Mu-Calculus consists of a set $D$ (the domain), a
valuation $\sigma$ for the individual symbols $\{a, b, c, \ldots\}$ and a valuation $\psi$ for
the relational symbols $\{A, B, C, \ldots\}$. The valuations assign an element
from the domain to each individual symbol, and a set of n-tuples from
the domain to each relational symbol. The meaning of an n-ary term
$\tau$ is a set of n-tuples which we will denote $\tau[\sigma, \psi]$. The 0-ary terms will
be called simply propositions, and denote truth values.

The terms of the Mu-Calculus are the least set such that:

1. Every relational symbol is a term.

2. If $\tau$ is an n-ary term and $(v_1, \ldots, v_n)$ are individual symbols, then
$\tau(v_1, \ldots, v_n)$ is a proposition.

3. If $\tau_1$ and $\tau_2$ are n-ary terms, then so are $\neg\tau_1$ and $(\tau_1 \lor \tau_2)$.

4. If $p$ is a proposition, and $v$ is an individual symbol, then $\exists v.\ p$ is
a proposition.

5. If $p$ is a proposition and $(v_1, \ldots, v_n)$ are individual symbols, then
$\lambda v_1, \ldots, v_n.\ p$ is an n-ary term.

6. If $\tau$ is an n-ary term and $Y$ is an n-ary relational symbol, where
$\tau$ is formally monotonic in $Y$, then $\mu Y.\ \tau$ and $\nu Y.\ \tau$ are n-ary
terms.

7. The usual abbreviations are used for $\land$, $\forall$, etc.

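The semantics of Mu-Calculus terms are defined as follows:

1. $R[\sigma, \psi] = \psi(R)$, where $R$ is a relational symbol.

2. $\tau(v_1, \ldots, v_n)[\sigma, \psi]$ is true iff $(\sigma(v_1), \ldots, \sigma(v_n))$ is in $\tau[\sigma, \psi]$.

3. $(\neg\tau)[\sigma, \psi] = D^n - \tau[\sigma, \psi]$. $(\tau_1 \lor \tau_2)[\sigma, \psi] = \tau_1[\sigma, \psi] \cup \tau_2[\sigma, \psi]$.

4. $(\exists v.\ p)[\sigma, \psi]$ is true iff for some $x \in D$, $p[\sigma(v \leftarrow x), \psi]$ is true.

5. $(\lambda v_1, \ldots, v_n.\ p)[\sigma, \psi]$ is the set of n-tuples $(x_1, \ldots, x_n) \in D^n$ such
that $p[\sigma(v_1 \leftarrow x_1, \ldots, v_n \leftarrow x_n), \psi]$ is true.

6. $(\mu Y.\ \tau)[\sigma, \psi]$, where $\tau$ is an n-ary term, is the least set $S \subseteq D^n$
such that $S = \tau[\sigma, \psi(Y \leftarrow S)]$. $(\nu Y.\ \tau)[\sigma, \psi]$ is the greatest such
$S$.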
2.6.1 Applications of the Mu-Calculus

The Mu-Calculus is quite expressive, as can be seen by the following
compendium of applications. To begin with, given a binary relation $R$,
the image of a set $Q \subseteq D$ via $R$ is

\[ R(Q) = \lambda y.\ \exists x.\ (R(x, y) \land Q(x)) \]

The set reachable from $Q$ in any number of steps of $R$ (including 0) is

\[ R^*(Q) = \mu Y.\ (Q \lor R(Y)) \]

The transitive (irreflexive) closure of the relation $R$ is

\[ R^+ = \mu Y.\ [R \lor \lambda x,z.\ \exists y.\ (Y(x, y) \land Y(y, z))] \]
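The three characterizations above can be tried out on explicitly enumerated sets. The following is only a sketch: it stores relations as Python sets of pairs instead of OBDDs, and the helper names (`image`, `lfp`, `reachable`, `transitive_closure`) are chosen here for illustration and do not come from the text.

```python
def image(R, Q):
    """R(Q): every y such that R(x, y) holds for some x in Q."""
    return frozenset(y for (x, y) in R if x in Q)

def lfp(step, bottom=frozenset()):
    """Least fixed point of a monotonic set function, iterating from the empty set."""
    current = bottom
    while True:
        following = step(current)
        if following == current:
            return current
        current = following

def reachable(R, Q):
    """R*(Q) = mu Y. (Q or R(Y))."""
    Q = frozenset(Q)
    return lfp(lambda Y: Q | image(R, Y))

def transitive_closure(R):
    """R+ = mu Y. [R or lambda x,z. exists y. (Y(x,y) and Y(y,z))]."""
    R = frozenset(R)
    def step(Y):
        joined = frozenset((x, z) for (x, y1) in Y for (y2, z) in Y if y1 == y2)
        return R | joined
    return lfp(step)

# Small example relation: a 3-cycle 0 -> 1 -> 2 -> 0 and an isolated self-loop on 3.
R = {(0, 1), (1, 2), (2, 0), (3, 3)}
from_zero = reachable(R, {0})
closure = transitive_closure(R)
```

In this example `from_zero` contains the states 0, 1 and 2, and `closure` relates 0 to itself because 0 lies on a cycle. A symbolic implementation would replace the set comprehensions by OBDD conjunction and existential quantification.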


CTL and fairness constraints

The interpretation of the operators of CTL in a Kripke model $(D, R, L)$
can be characterized in the Mu-Calculus as follows:

\[ EX\,p = \lambda x.\ \exists y.\ (R(x, y) \land p(y)) \]

\[ EF\,p = \mu Y.\ (p \lor EX\,Y) \]

\[ EG\,p = \nu Y.\ (p \land EX\,Y) \]

\[ E(q\ U\ p) = \mu Y.\ (p \lor (q \land EX\,Y)) \]
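As a rough illustration of these four characterizations, the sketch below evaluates them over an explicitly listed Kripke structure. It assumes states are small integers, the transition relation is a set of pairs and propositions are sets of states; the function names (`fix`, `ex`, `ef`, `eg`, `eu`) are illustrative only, and no OBDDs are involved.

```python
def fix(step, start):
    """Iterate a monotonic set function from `start` until it stabilises."""
    current = start
    while True:
        following = step(current)
        if following == current:
            return current
        current = following

def ex(R, p):
    """EX p: states with at least one R-successor in p."""
    return frozenset(x for (x, y) in R if y in p)

def ef(R, p):
    """EF p = mu Y. (p or EX Y), iterated from the empty set."""
    return fix(lambda Y: p | ex(R, Y), frozenset())

def eg(states, R, p):
    """EG p = nu Y. (p and EX Y), iterated from the set of all states."""
    return fix(lambda Y: p & ex(R, Y), states)

def eu(R, q, p):
    """E(q U p) = mu Y. (p or (q and EX Y))."""
    return fix(lambda Y: p | (q & ex(R, Y)), frozenset())

states = frozenset({0, 1, 2, 3})
R = {(0, 1), (1, 2), (2, 2), (3, 3)}
can_reach_2 = ef(R, frozenset({2}))
stay_in_1_2 = eg(states, R, frozenset({1, 2}))
one_until_2 = eu(R, frozenset({1}), frozenset({2}))
```

The least fixed points start from the empty set and grow, while the greatest fixed point starts from all states and shrinks, which mirrors the choice of initial value in the fixed point iteration.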

In addition to these standard operators, we can also characterize the
CTL operators under fairness constraints. A fairness constraint in its
simplest form is a condition that is assumed to hold infinitely often
along all computation paths. Such conditions can be used to enforce
fair scheduling of processes and access to resources. They are not
directly expressible in CTL, since the tense operators F and G cannot
be directly combined. Instead, we restrict the path quantifiers of CTL
to apply only to those paths along which each formula in a set $C$ holds
infinitely often. To distinguish these constrained path quantifiers from
ordinary path quantifiers, we subscript them with $C$. Thus, $A_C f$, where
$C$ is a set of CTL formulas and $f$ is a linear formula, means that for
all paths, if each formula of $C$ is true infinitely often, then $f$ is true.
Similarly, the formula $E_C f$ means that there exists a path such that
each formula of $C$ is true infinitely often and $f$ is true. Here, we
consider only the CTL operators with existential path quantifiers, since
the operators with universal quantifiers can be derived from these.

The formula $E_C G\,p$ is true when there is some path in which $p$ is
true in every state, and each element of $C$ is true infinitely often. Let

\[ \tau[Y] = p \land EX \bigwedge_{c \in C} E(Y\ U\ (Y \land c)). \]

We argue as follows that $E_C G\,p$ is the greatest fixed point of $\tau$. First,
if $Y$ is a fixed point, then every state in $Y$ satisfies $p$, and further, has
a nontrivial path remaining in $Y$ which leads to a state satisfying each
fairness constraint. Hence, a looping path can be constructed satisfying
each infinitely often without exiting $Y$. Thus $Y \subseteq E_C G\,p$. On the other
hand, suppose $Y = E_C G\,p$. Since every state in $Y$ has a path touching
each fairness constraint infinitely often, as does each state along that
path, it follows that every state in $Y$ can reach every fairness constraint
without exiting $Y$. Thus $Y \subseteq \tau[Y]$. Therefore, $E_C G\,p$ is the greatest
fixed point of $\tau$. The set of states satisfying $E_C G\,p$ is expressed in the
Mu-Calculus as

\[ \nu Y.\ (p \land EX \bigwedge_{c \in C} E(Y\ U\ (Y \land c))) \]
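The nested fixed point for fair EG can be sketched on an explicit state graph as follows. This is an assumption-laden illustration, not the symbolic algorithm: sets of states stand in for OBDDs, each fairness constraint is given as a set of states, and the names `fix`, `ex`, `eu` and `fair_eg` are hypothetical.

```python
def fix(step, start):
    """Iterate a monotonic set function from `start` until it stabilises."""
    current = start
    while True:
        following = step(current)
        if following == current:
            return current
        current = following

def ex(R, p):
    """EX p: states with at least one R-successor in p."""
    return frozenset(x for (x, y) in R if y in p)

def eu(R, q, p):
    """E(q U p) = mu Y. (p or (q and EX Y))."""
    return fix(lambda Y: p | (q & ex(R, Y)), frozenset())

def fair_eg(states, R, p, constraints):
    """nu Y. (p and EX AND_{c in C} E(Y U (Y and c)))."""
    def step(Y):
        target = states
        for c in constraints:
            # States that can reach a c-state of Y while staying inside Y.
            target = target & eu(R, Y, Y & c)
        return p & ex(R, target)
    return fix(step, states)

states = frozenset({0, 1, 2, 3})
# 0 loops on itself or moves to the cycle 1 <-> 2; 3 only loops on itself.
R = {(0, 0), (0, 1), (1, 2), (2, 1), (3, 3)}
visits_2_forever = fair_eg(states, R, states, [frozenset({2})])
visits_0_and_2 = fair_eg(states, R, states, [frozenset({0}), frozenset({2})])
```

With the single constraint {2}, states 0, 1 and 2 have a fair path and state 3 does not. Requiring both {0} and {2} infinitely often leaves no fair path, because the cycle through 2 cannot return to 0.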

The remaining operators under fairness constraints can be characterized
in terms of $E_C G\,p$, as follows:

\[ E_C X\,p = EX(p \land E_C G\ \mathit{true}) \]

\[ E_C F\,p = EF(p \land E_C G\ \mathit{true}) \]

\[ E_C(q\ U\ p) = E(q\ U\ (p \land E_C G\ \mathit{true})) \]

Emerson and Lei [EL86] give a characterization in the Mu-Calculus of
CTL under a more general class of fairness constraints. Each constraint
in this scheme requires that one condition holds infinitely often or a
second condition holds finitely often (for example, either acknowledge
holds infinitely often, or request holds finitely often).
Simulation relations

Two states $x$ and $y$ of a Kripke structure are said to be bisimilar if:

1. $x$ and $y$ agree on the atomic propositions,

2. every successor of $x$ is bisimilar to a successor of $y$, and

3. every successor of $y$ is bisimilar to a successor of $x$.

Two states are bisimilar if and only if they satisfy the same set of
CTL formulas [BCG87]. If $(a_1, a_2, \ldots, a_k)$ are the atomic propositions,
then the bisimulation relation can be expressed in the Mu-Calculus as
follows:

\[ \nu Y.\ \lambda x,y.\ \bigwedge_i (a_i(x) \Leftrightarrow a_i(y)) \]
\[ \quad \land\ \forall x'.\ (R(x, x') \Rightarrow \exists y'.\ (R(y, y') \land Y(x', y'))) \]
\[ \quad \land\ \forall y'.\ (R(y, y') \Rightarrow \exists x'.\ (R(x, x') \land Y(x', y'))) \]

where we have, as usual, identified each atomic proposition with the
set of states in which it is true. There is also an asymmetric notion of
simulation: we say that a state $x$ simulates a state $y$ if:

1. $x$ and $y$ agree on the atomic propositions, and

2. every successor of $y$ is simulated by a successor of $x$.

If state $x$ simulates state $y$, then $y$ satisfies every formula satisfied by
$x$ in a dialect of CTL called $\forall$CTL, which allows only universal path
quantifiers [GL91].* Testing bisimulation and simulation relations can
be used as a form of verification, or it can be used to test abstractions
used in compositional model checking techniques [CLM89a, GL91]. The
same idea can easily be extended to systems with labeled transitions.

* In fact, this is shown true for $\forall$CTL*, an extension of $\forall$CTL which
allows unrestricted linear temporal formulas to be quantified by path
quantifiers.
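The bisimulation relation, being a greatest fixed point, can be computed by starting from all state pairs and discarding pairs that violate one of the three conditions. The sketch below does this on an explicit Kripke structure; the data layout (a `labels` dictionary mapping each state to its atomic propositions) and the function name `bisimulation` are assumptions made for the example.

```python
def bisimulation(states, R, labels):
    """Greatest relation Y in which related states agree on their labels and
    every successor of one state is matched by a related successor of the other."""
    succ = {s: {y for (x, y) in R if x == s} for s in states}

    def step(Y):
        return frozenset(
            (x, y)
            for x in states
            for y in states
            if labels[x] == labels[y]
            and all(any((x2, y2) in Y for y2 in succ[y]) for x2 in succ[x])
            and all(any((x2, y2) in Y for x2 in succ[x]) for y2 in succ[y])
        )

    # Start from the full relation and shrink until nothing changes.
    current = frozenset((x, y) for x in states for y in states)
    while True:
        following = step(current)
        if following == current:
            return current
        current = following

states = [0, 1, 2, 3, 4]
R = {(0, 1), (1, 0), (2, 3), (3, 2)}           # two identical 2-cycles; 4 is a dead end
labels = {0: "a", 1: "b", 2: "a", 3: "b", 4: "a"}
B = bisimulation(states, R, labels)
```

Here states 0 and 2 are bisimilar, while 0 and 4 are not: they carry the same label, but 4 has no successor to match the successor of 0. Dropping the third condition from `step` would give the asymmetric simulation relation instead.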
Language containment

The Mu-Calculus can express the relation of language containment
between two deterministic $\omega$-automata. For the sake of simplicity, we
consider only deterministic Büchi automata, which are not complete
for the class of $\omega$-regular languages, but it is not substantially more
difficult to handle more general classes of deterministic automata, such
as Streett automata.

A finite deterministic Büchi automaton consists of a set of states
$K$, an initial state $p_0 \in K$, an alphabet $\Sigma$, a set of transitions
$\Delta \subseteq K \times \Sigma \times K$, and an acceptance set $B \subseteq K$. The transition relation is
such that, for any state $p$ and symbol $\sigma$, there is exactly one $q$ for which
$\Delta(p, \sigma, q)$. The automaton accepts an infinite sequence $\sigma \in \Sigma^\omega$ iff the
sequence of states $p$, where $\Delta(p_i, \sigma_i, p_{i+1})$ holds for all $i$, passes through
the acceptance set $B$ infinitely often. The set of sequences accepted by
an automaton $M$ is called the language of $M$ and denoted $\mathcal{L}(M)$.

To determine whether the language of a Büchi automaton $M$ is
contained in the language of a Büchi automaton $M'$ (with the same
alphabet), we define a Kripke structure representing the product of $M$
and $M'$, and write a formula in CTL which is true if and only if every
sequence accepted by $M$ is also accepted by $M'$ [CDK90]. This formula
can be evaluated using its Mu-Calculus characterization.

The product is defined by its transition relation $R$, and set of initial
states $S_0$. Let

1. $R = \lambda s,s',r,r'.\ \exists\sigma.\ (\Delta(s, \sigma, r) \land \Delta'(s', \sigma, r'))$.

2. $S_0 = \lambda s,s'.\ ((s = p_0) \land (s' = p'_0))$.

There is a sequence in the language of $M$ but not in the language
of $M'$ if and only if there is a path of the product passing through
$B$ infinitely often, but not through $B'$ infinitely often. Thus, $\mathcal{L}(M) \subseteq
\mathcal{L}(M')$ iff

\[ S_0 \Rightarrow \neg E_{\{B\}}(FG\ \lambda s,s'.\ \neg B'(s')) \]

Another possible approach to the language containment problem
makes use of the transitive closure of the transition relation. First, we
remove from the product structure all transitions that begin or end
with a state in $B'$. That is, let

\[ T = \lambda s,s',r,r'.\ [R(s, s', r, r') \land \neg B'(s') \land \neg B'(r')] \]

The transitive closure of this relation is

\[ T^+ = \mu Q.\ \lambda s,s',r,r'.\ [T(s, s', r, r') \lor \exists u,u'.\ [Q(s, s', u, u') \land Q(u, u', r, r')]] \]

This is the set of all pairs $\langle x, y \rangle$ of states of the product such that $x$
can reach $y$ without passing through $B'$. This holds for the pair $\langle x, x \rangle$
if and only if $x$ is on a cycle not passing through $B'$. If there is any
such $x$ in $B$, and $x$ is reachable, then there is a path passing through
$B$ but not $B'$ infinitely often, hence there is a sequence in $\mathcal{L}(M)$, but
not in $\mathcal{L}(M')$. The converse is also true. Hence, $\mathcal{L}(M) \subseteq \mathcal{L}(M')$ if and
only if $\neg EF\ \lambda s,s'.\ (T^+(s, s', s, s') \land B(s))$. The EF operator can also
be evaluated using the transitive closure, since

\[ EF\,p = \lambda x.\ (p(x) \lor \exists y.\ (R^+(x, y) \land p(y))) \]
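The transitive closure approach can be sketched for small, explicitly given automata. The code below is an illustration under several assumptions: each automaton is a hypothetical triple `(delta, initial, accepting)` with `delta` a dictionary keyed by `(state, symbol)`, both automata are complete and deterministic, and only product states reachable from the initial pair are explored (which plays the role of the EF operator).

```python
def transitive_closure(T):
    """Smallest transitively closed relation containing T."""
    closure = set(T)
    while True:
        extra = {(x, z) for (x, y1) in closure for (y2, z) in closure if y1 == y2}
        if extra <= closure:
            return closure
        closure |= extra

def contained(sigma, M, M2):
    """True iff L(M) is a subset of L(M2) for deterministic Buchi automata."""
    (delta, p0, B), (delta2, p02, B2) = M, M2
    start = (p0, p02)
    seen, stack, edges = {start}, [start], set()
    while stack:                                   # reachable part of the product
        s, s2 = stack.pop()
        for a in sigma:
            nxt = (delta[s, a], delta2[s2, a])
            edges.add(((s, s2), nxt))
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    # T: drop every transition that begins or ends in a state of B'.
    T = {(u, v) for (u, v) in edges if u[1] not in B2 and v[1] not in B2}
    T_plus = transitive_closure(T)
    # A reachable B-state on a cycle avoiding B' witnesses non-containment.
    return not any((x, x) in T_plus and x[0] in B for x in seen)

sigma = ["a", "b"]
inf_a = ({(0, "a"): 1, (1, "a"): 1, (0, "b"): 0, (1, "b"): 0}, 0, {1})  # infinitely many a
inf_b = ({(0, "b"): 1, (1, "b"): 1, (0, "a"): 0, (1, "a"): 0}, 0, {1})  # infinitely many b
everything = ({(0, "a"): 0, (0, "b"): 0}, 0, {0})                       # accepts every word
```

For these automata, `contained(sigma, inf_a, inf_b)` is false (the word consisting only of a's is a counterexample), while `contained(sigma, inf_a, everything)` is true.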

2.6.2 Symbolic algorithm

By devising a symbolic model checking procedure for the Mu-Calculus,
we can quickly establish symbolic algorithms for all of the above
properties. If we assume that the domain is $B^k$, a symbolic model checking
algorithm is easily established, by translating formulas into a Boolean
Mu-Calculus where the domain is just $B = \{\mathit{false}, \mathit{true}\}$. This is done
by replacing every individual symbol $a$ by a k-tuple of individual
symbols $(a_1, a_2, \ldots, a_k)$. Thus, every n-ary term translates to a kn-ary term.
In the Boolean Mu-Calculus we can represent terms by Boolean
formulas by introducing a new set of dummy individual symbols $d_1, d_2, \ldots$
to represent relational parameters. An n-ary term $\tau$ is represented by
a formula $\tau[\psi]$ such that

\[ (x_1, \ldots, x_n) \in \tau[\sigma, \psi] \iff \tau[\psi] \text{ is true under } \sigma(d_1 \leftarrow x_1, \ldots, d_n \leftarrow x_n) \]

Given $\psi$, we can compute the formula representing a term in the Boolean
Mu-Calculus by recursion over its structure, as follows:

1. The value of relational variable $A$ is $\psi(A)$.

2. The logical connectives and quantifiers are evaluated by the
corresponding QBF operations.

3. The value of an n-ary term $\lambda v_1, \ldots, v_n.\ \tau$ is
$\tau[\psi](v_1 \leftarrow d_1, \ldots, v_n \leftarrow d_n)$.

4. The value of the proposition $\tau(v_1, \ldots, v_n)$, where $\tau$ is an n-ary
term, is $\tau[\psi](d_1 \leftarrow v_1, \ldots, d_n \leftarrow v_n)$.

5. The n-ary relational terms $\mu Y.\ \tau$ and $\nu Y.\ \tau$ are evaluated using
the standard fixed point algorithm.

Because the variables are Boolean valued, we can implement all of the
above using the operations of QBF, with OBDDs as our representation.
The symbolic Mu-Calculus model checking algorithm is shown
in pseudo-code form in figure 2.15. Using this algorithm, any quantity
that can be characterized in the Mu-Calculus can be computed using
the symbolic model checking technique, with the possibility that a
combinatorial explosion can be reduced or avoided. This also allows us to
use the expressive powers of the Mu-Calculus in describing and
manipulating symbolic algorithms, with the understanding that the translation
from Mu-Calculus to a symbolic algorithm is merely mechanical.


2.7 Computing equivalence relations

In this section, we consider the problem of computing a symbolic
representation of the equivalence relation between the states of two finite
state machines, or between states of the same machine. In the former
case, the relation can be used to determine the equivalence of the two
machines, while in the latter case, as Lin et al. have observed [LTN90],
the self equivalence relation can be used in optimizing the logic or
register usage of the machine.
function eval(τ, ψ)
  case
    τ a relational variable: return ψ(τ)
    τ = ¬p: return ¬eval(p, ψ)
    τ = p ∨ q: return eval(p, ψ) ∨ eval(q, ψ)
    τ = ∃w. p: return ∃w. eval(p, ψ)
    τ = μY. p: return fixedpoint(Y, p, ψ(Y ← false))
    τ = νY. p: return fixedpoint(Y, p, ψ(Y ← true))
  end case
end function

function fixedpoint(Y, p, ψ)
  Y' = eval(p, ψ)
  if Y' = ψ(Y) then return Y'
  else return fixedpoint(Y, p, ψ(Y ← Y'))
end function

Figure 2.15: Symbolic Mu-Calculus model checking algorithm
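The recursive structure of figure 2.15 can be mirrored by a small explicit-state evaluator. The sketch below makes two simplifying assumptions that are not in the figure: terms are unary predicates represented as sets of states, and the existential quantifier is replaced by a built-in `ex` (successor) operator over a fixed transition relation. The tuple encoding of terms and the names `eval_term` and `fixedpoint` are hypothetical.

```python
def eval_term(term, psi, states, R):
    """Evaluate a term to a set of states; psi maps relational variables to sets."""
    kind = term[0]
    if kind == "var":
        return psi[term[1]]
    if kind == "not":
        return states - eval_term(term[1], psi, states, R)
    if kind == "or":
        return eval_term(term[1], psi, states, R) | eval_term(term[2], psi, states, R)
    if kind == "ex":
        target = eval_term(term[1], psi, states, R)
        return frozenset(x for (x, y) in R if y in target)
    if kind in ("mu", "nu"):
        _, Y, body = term
        start = frozenset() if kind == "mu" else states   # false for mu, true for nu
        return fixedpoint(Y, body, {**psi, Y: start}, states, R)
    raise ValueError(f"unknown term kind: {kind}")

def fixedpoint(Y, body, psi, states, R):
    """Re-evaluate the body until the value bound to Y no longer changes."""
    while True:
        new_value = eval_term(body, psi, states, R)
        if new_value == psi[Y]:
            return new_value
        psi = {**psi, Y: new_value}

states = frozenset({0, 1, 2, 3})
R = {(0, 1), (1, 2), (2, 2), (3, 3)}
# EF p = mu Y. (p or EX Y)
ef_p = ("mu", "Y", ("or", ("var", "p"), ("ex", ("var", "Y"))))
# EG p = nu Y. (p and EX Y), with "and" written via "not" and "or"
eg_p = ("nu", "Y", ("not", ("or", ("not", ("var", "p")), ("not", ("ex", ("var", "Y"))))))
ef_result = eval_term(ef_p, {"p": frozenset({2})}, states, R)
eg_result = eval_term(eg_p, {"p": frozenset({1, 2})}, states, R)
```

The iteration is written as a loop instead of the tail recursion used in the figure, which avoids deep call stacks but computes the same sequence of approximations.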
2.7.1 State equivalence

We use a standard notion of the equivalence of states of finite Mealy
machines. Two states are equivalent if and only if for all input
sequences, they yield the same output sequence. The following is an
alternate characterization: equivalence is the greatest relation between
states such that if $x$ is equivalent to $y$, then for all inputs, the output
in state $x$ is equal to the output in state $y$, and the successor state of
$x$ is equivalent to the successor state of $y$. Let $\delta(x, z)$ be the function
which determines the next state, as a function of current state $x$ and
current input $z$, and let $\gamma(x, z)$ be the function that determines the
current output. In the Mu-Calculus, the equivalence relation is

\[ R_\infty = \nu R.\ \lambda x,y.\ \forall z.\ (\gamma(x, z) = \gamma(y, z) \land R(\delta(x, z), \delta(y, z))) \tag{2.22} \]

Using the standard fixed point approach, we can evaluate this relation
by a sequence of approximations $R_0, R_1, \ldots$, where $R_i$ is the set of state
pairs which are equivalent for all input sequences of length $i$. This
sequence is characterized by the recurrence

\[ R_1 = \lambda x,y.\ \forall z.\ (\gamma(x, z) = \gamma(y, z)) \tag{2.23} \]

and

\[ R_{i+1} = \lambda x,y.\ \forall z.\ (R_i(x, y) \land R_i(\delta(x, z), \delta(y, z))) \tag{2.24} \]

This is simply the standard $O(n^2)$ algorithm for computing state
equivalence of Mealy machines. The problem of determining whether two
Mealy machines are equivalent in their initial states can be approached
in two ways: either their equivalence relation can be computed, or the
state space of their product can be exhausted by a forward search. The
number of iterations required for the former approach can be
substantially less, however. In the trivial case of an n-bit counter, the number
of iterations in the forward search is exponential in $n$, while one step
suffices to reach a fixed point in the equivalence calculation, since all
states are distinguished by their outputs.
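The recurrence 2.23 and 2.24 can be followed literally on a small explicit Mealy machine. The sketch below assumes the next-state and output functions are given as dictionaries keyed by `(state, input)`; the names `state_equivalence`, `delta` and `gamma` and the three-state example are illustrative, and sets of pairs stand in for the OBDD representation.

```python
from itertools import product

def state_equivalence(states, inputs, delta, gamma):
    """Greatest fixed point of the state equivalence recurrence."""
    # R_1: pairs whose outputs agree on every input (equation 2.23).
    R = frozenset(
        (x, y)
        for x, y in product(states, repeat=2)
        if all(gamma[x, z] == gamma[y, z] for z in inputs)
    )
    while True:
        # R_{i+1}: keep pairs whose successors are related for every input (2.24).
        following = frozenset(
            (x, y)
            for (x, y) in R
            if all((delta[x, z], delta[y, z]) in R for z in inputs)
        )
        if following == R:
            return R
        R = following

states = ["A", "B", "C"]
inputs = [0, 1]
delta = {("A", 0): "A", ("A", 1): "B",
         ("B", 0): "C", ("B", 1): "A",
         ("C", 0): "B", ("C", 1): "A"}
gamma = {("A", 0): 0, ("A", 1): 0,
         ("B", 0): 1, ("B", 1): 1,
         ("C", 0): 1, ("C", 1): 1}
equiv = state_equivalence(states, inputs, delta, gamma)
```

In this example B and C produce the same outputs and move to related states, so they end up equivalent, while A is equivalent only to itself. The fixed point is reached after the first refinement step.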

It is immediately seen that the crucial step in calculation 2.24 is the
substitution of vector functions $\delta(x, z)$ and $\delta(y, z)$ into $R_i$. The most
obvious way to accomplish this is to use Bryant's Compose algorithm.
Some other possible methods are introduced in this section. Computing
the OBDD representation for a composition of functions is an NP-hard
problem (cf. section 2.8.4), thus we expect no good general solutions
to the problem. Another tractability issue is whether the
approximations to the equivalence relation can be compactly represented using
OBDDs. There is no guarantee of this, of course, but there is some
reason to believe, a priori, that it may often be the case. First of all,
for single-machine equivalence, if all distinct states are distinguishable,
then the equivalence relation is the identity relation, which can be
represented by a linear size OBDD, provided the component variables of
$x$ and $y$ are interleaved in the variable ordering. It also seems plausible
that the equivalence relation will often be simply a logical conjunction
of independent relations, each corresponding to some modular
component of the system. In this case, the OBDD representation will also
be compact, provided the variable ordering conforms to the modular
structure of the machine. In any case, we will see examples of fairly
complex machines whose equivalence relations are expressed compactly
in OBDD form.

Algorithm using restrictions

Because of the basic difficulty of computing compositions of OBDDs,
it is useful to have some restrictions on the result in order to be able
to solve the problem. Fortunately, the decreasing nature of the series
of approximations defined in 2.24 provides a constraint on the result of
the substitution, since each approximation $R_{i+1}$ is strictly contained in
$R_i$. We can use this fact by rewriting 2.24 as

\[ R_{i+1} = \lambda x,y.\ \forall z.\ (R_i(x, y) \land (R_i(\delta(x, z), \delta(y, z)) \downarrow R_i)) \tag{2.25} \]

where $\downarrow$ represents the Restrict operator introduced by Coudert, Madre
and Berthet [CBM89]. This operation produces a function which agrees
with $R_i(\delta(x, z), \delta(y, z))$ over the set $R_i$, attempting to minimize the
OBDD size. The restriction can be used to varying advantage,
depending on the algorithm used for substitution.

Iterative abstraction algorithm

Another way to provide a restriction on the equivalence relation is first
to find the equivalence relation of an abstracted machine. We choose the
abstraction in such a way that the equivalence relation of the abstract
machine is strictly weaker than the equivalence relation of the original
machine. Thus, we can compute the equivalence relation of the abstract
machine first, and use it as a restriction in computing the equivalence
relation of the original machine. In particular, we can abstract the
machine by choosing a subset $V$ of the state variables, and at each
approximation quantifying out the remaining variables existentially.
That is, let $\bar{V}$ be the complement of $V$, and let

\[ R^V_1 = \lambda x,y.\ \exists \bar{V}.\ \forall z.\ (\gamma(x, z) = \gamma(y, z)) \tag{2.26} \]

and

\[ R^V_{i+1} = \lambda x,y.\ \exists \bar{V}.\ \forall z.\ (R^V_i(x, y) \land (R^V_i(\delta(x, z), \delta(y, z)) \downarrow R^V_i)) \tag{2.27} \]

It is trivial to see that each approximation in the series $R^V$ is strictly
weaker than the corresponding approximation in $R$. It follows that $R^V_\infty$,
the greatest fixed point, is weaker than $R_\infty$. Therefore, we can restrict
the entire calculation of $R_\infty$ to only those state pairs satisfying $R^V_\infty$. In
addition, we can use a series of subsets $V_1 \subset V_2 \subset \cdots \subset V_k$, where $V_k$ is
the set of all state variables, restricting the first approximation in each
series $R^{V_j}$ to the equivalence relation for the previous subset. Thus, we
let

\[ R^{V_j}_1 = R^{V_{j-1}}_\infty \land \lambda x,y.\ \exists \bar{V}_j.\ \forall z.\ (\gamma(x, z) = \gamma(y, z)) \tag{2.28} \]

and

\[ R^{V_j}_{i+1} = \lambda x,y.\ \exists \bar{V}_j.\ \forall z.\ (R^{V_j}_i(x, y) \land (R^{V_j}_i(\delta(x, z), \delta(y, z)) \downarrow R^{V_j}_i)) \tag{2.29} \]

We will refer to this as the iterative abstraction algorithm for
computing the equivalence relation. By adding only a few variables to each
successive subset, we can in some cases obtain a fairly strong restriction,
which allows the substitution to be computed more efficiently. In other
cases the equivalence relation obtained for the abstracted machine may
be trivial, since abstracting out the variables in $\bar{V}$ may result in all
machine states appearing equivalent at the outputs. This is especially
likely if the abstracted variables hold important control information
that enables machine registers to be observed at the outputs.
Nonetheless, we can show cases where this incremental approach is greatly more
efficient than the basic algorithm.
2.7.2 Methods for functional composition

This section considers methods for substituting functions for variables
in OBDDs. This operation is referred to by Bryant as Compose. It is
the syntactic mechanism corresponding to functional composition. As
such, it has a number of applications apart from finding the equivalence
relation of finite state machines, including the evaluation of CTL
formulas [BF89b]. Most of the algorithms presented here for this purpose
have been modified to take as an extra argument a restriction on the
result, in the hope that efficiency can be obtained by combining these two
operations. We consider the problem of calculating $f(g_1, \ldots, g_n) \downarrow R$,
where $f$, $g_1, \ldots, g_n$ and $R$ are all Boolean functions.


"Bottom-up" substitution

This is the method originally proposed by Bryant for his Compose
algorithm, but with a restriction on the result. In this method, we view
each OBDD node in $f$ as a gate, which computes the function "if $v_i$
then $h$ else $l$", or equivalently, $(\neg v_i \land l) \lor (v_i \land h)$. Having substituted
the functions $g_1, \ldots, g_n$ for the variables in $l$ and $h$, we can then
compute the result for $f$ using the standard $\lor$ and $\land$ operators. The basic
bottom-up algorithm is

function bottom-up(f, R)
  if f is a leaf then return f
  if bottom-up(f, R) has already been solved then return old solution
  else f is a triple (v_i, f_l, f_h)
    l = bottom-up(f_l, R)
    h = bottom-up(f_h, R)
    return ((¬g_i ∧ l) ∨ (g_i ∧ h)) ↓ R
end

Note that the restriction operator is used at each step to simplify
the result. Since each subproblem is solved only once, the number of
recursive calls to bottom-up is $O(|f|)$.
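The bottom-up recursion can be illustrated without a full OBDD package. In the sketch below, which rests on assumptions not in the text, the diagram for f is a nested tuple `(i, low, high)` with Boolean leaves, and each substituted function is held as a truth table packed into a Python integer (bit a is the function value on assignment a). The restriction step is omitted, and the names `var_table` and `bottom_up` are hypothetical.

```python
def var_table(j, m):
    """Truth table of variable j over m variables, as a bitmask of 2**m bits."""
    return sum(1 << a for a in range(1 << m) if (a >> j) & 1)

def bottom_up(f, g, m, memo=None):
    """Substitute g[i] for variable i in the decision diagram f.

    f is True, False, or a triple (i, low, high) meaning "if v_i then high else low".
    The result is a truth table over the m variables of the g functions."""
    full = (1 << (1 << m)) - 1          # truth table of the constant true
    if memo is None:
        memo = {}
    if f is True:
        return full
    if f is False:
        return 0
    if f in memo:                        # each distinct node is solved only once
        return memo[f]
    i, low, high = f
    l = bottom_up(low, g, m, memo)
    h = bottom_up(high, g, m, memo)
    result = ((full ^ g[i]) & l) | (g[i] & h)   # (not g_i and l) or (g_i and h)
    memo[f] = result
    return result

m = 2
y0, y1 = var_table(0, m), var_table(1, m)
f_xor = (0, (1, False, True), (1, True, False))   # f = v_0 xor v_1
g = {0: y0 & y1, 1: y0 | y1}                      # v_0 <- y0 and y1, v_1 <- y0 or y1
composed = bottom_up(f_xor, g, m)
```

Since (y0 and y1) xor (y0 or y1) simplifies to y0 xor y1, `composed` equals `y0 ^ y1`. With OBDDs in place of truth tables, the `&`, `|` and `^` operations would be the corresponding OBDD apply operations.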
Domain partitioning

The domain partitioning strategy is so named because it divides the
problem into two subproblems by partitioning the domain of the
functions $g_1, \ldots, g_n$ according to the value of one of the variables. The
operation proceeds in several steps.

First, we observe that if any of $g_1, \ldots, g_n$ are constants, we can
immediately substitute these values into $f$, since substitution by a
constant is a linear time operation which can only reduce the size of the
OBDD. We use the fact that if $g_i = c$, where $c$ is 0 or 1, then

\[ f(g_1, \ldots, g_n) = f(v_i \leftarrow c)(g_1, \ldots, g_n) \tag{2.30} \]

Next, we observe that we can eliminate any argument position on which
$f$ does not depend, thus obtaining a smaller problem with the same
result. We can determine the set of variables on which $f$ depends in
linear time, since $f$ depends on $v_i$ if and only if $v_i$ appears in some node
in $f$.

If at this point the function $f$ has been reduced to a constant, we are
done. Otherwise, we split the problem into two cases and recurse. We
choose the first variable $v_i$ occurring in $g_1, \ldots, g_n$, and apply Shannon's
expansion, obtaining two subproblems

\[ l = f(g_1(v_i \leftarrow 0), \ldots, g_n(v_i \leftarrow 0)) \]

\[ h = f(g_1(v_i \leftarrow 1), \ldots, g_n(v_i \leftarrow 1)) \]

As in other OBDD algorithms, the result is an OBDD $r = (v_i, l, h)$,
provided $l \ne h$, otherwise $r = l = h$. Needless to say, we use a hash
table, caching the results of subproblems so that the same subproblem
is not solved twice. With caching, the complexity of the algorithm is
$O(|f| \cdot \prod_i |g_i|)$.

Making use of the restriction $R$ in this algorithm is straightforward.
If $R = 0$, the result can be any function at all, so we simply return 0.
Each time we partition the problem into subproblems, we also split $R$
into two cases, $R(v_i \leftarrow 0)$ and $R(v_i \leftarrow 1)$. The restriction has the effect
of cutting off the recursion each time a 0 leaf is reached in $R$.
Sequential substitution

This is perhaps the simplest approach to substitution; it transforms
a simultaneous substitution problem into a sequence of substitutions.
This is done by replacing each variable $v_i$ in the OBDD for $f$ by a new
variable $v'_i$. Having done this, it is safe to perform the substitutions of
each function $g_i$ for $v'_i$ in any order, since none of the functions $g_1, \ldots, g_n$
depends on any variable being substituted. Substitution of a function
for a single variable can be accomplished as follows:

\[ f(v'_i \leftarrow g_i) = \exists v'_i.\ [(v'_i \Leftrightarrow g_i) \land f] \tag{2.31} \]

This approach can also make effective use of a restriction. The
restriction operator may in fact be applied after each substitution
step if desired, potentially reducing the size of the intermediate
results. In the case of the iterative abstraction algorithm, the fact that
some of the variables in the result will later be quantified out
existentially can also be put to use. We can move the existential quantifiers for
these variables inside the conjunction, thus quantifying the abstracted
variables out of the term $(v'_i \Leftrightarrow g_i)$ before applying the conjunction.
This may weaken our result somewhat, since $[\exists x.\ a] \land [\exists x.\ b]$ is weaker
than $\exists x.\ [a \land b]$, but it can produce significant reductions in the size of
the intermediate results. The final result of the equivalence algorithm
is unchanged, since it is computed with no variables abstracted.
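Identity 2.31 and the renaming trick can be checked by brute force on small functions. In this sketch Boolean functions are plain Python callables over a tuple of truth values, with positions 0..n-1 for the original variables and n..2n-1 for their primed copies; this encoding and the helper names are assumptions for illustration, and a real implementation would perform the same steps on OBDDs.

```python
from itertools import product

def with_value(env, j, value):
    """Copy of the assignment env with position j set to value."""
    return env[:j] + (value,) + env[j + 1:]

def exists(f, j):
    """Existential quantification of variable j."""
    return lambda env: f(with_value(env, j, False)) or f(with_value(env, j, True))

def rename(f, j, j_new):
    """f with variable j replaced by variable j_new."""
    return lambda env: f(with_value(env, j, env[j_new]))

def substitute_one(f, j, g):
    """f(v_j <- g) = exists v_j. [(v_j <=> g) and f]; g must not depend on v_j."""
    return exists(lambda env: (env[j] == g(env)) and f(env), j)

def substitute_all(f, gs):
    """Simultaneous substitution of gs[i] for variable i, done one variable at a time."""
    n = len(gs)
    for i in range(n):
        f = rename(f, i, n + i)              # v_i becomes its primed copy
    for i in range(n):
        f = substitute_one(f, n + i, gs[i])  # the order is now irrelevant
    return f

# f(v_0, v_1) = v_0 and not v_1, with the swap v_0 <- v_1, v_1 <- v_0.
f = lambda env: env[0] and not env[1]
swapped = substitute_all(f, [lambda env: env[1], lambda env: env[0]])
all_envs = list(product([False, True], repeat=4))
```

The swap is a case where substituting directly into the unprimed variables one after the other would give the wrong answer, since the second substitution would see the result of the first. After renaming, `swapped` agrees with `env[1] and not env[0]` on every assignment.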

2.7.3 Experimental results

This section presents the results of applying the various equivalence
relation algorithms to several example state machines, with a range
of complexity. The results are compared to published results for the
same circuits by Touati et al. (computing only the reachable states)
and Lin et al. (computing the equivalence relation). In all cases, it
is self-equivalence that is calculated. It would be interesting to have
some results in this section on calculating the state equivalence relation
between two different implementations of a given machine, but
unfortunately, such examples were lacking. The three different approaches
to OBDD substitution are compared, for each example. Where
possible, the direct algorithm is used, otherwise, the iterative abstraction
machine, mtd, result (nodes), b-u (secs), d-p (secs), seq (secs), Touati (secs), Lin (secs)
sbc, iter, 361, 2903
cpb32, iter, 95, 14.4, 54.4, 14.1, 12.10
key, iter, 167, 342, 1738, 884, 5706, 175.20
minmax10, dir, 89, 204
minmax20, dir, 59, 3.0, 4.5
minmax30, dir, 89, 6.2, 9.0, 15.7

Table 2.1: Equivalence calculation times

algorithm is used. For example, the state equivalence relation for the
machine sbc was calculated using iteration, but could not be calculated
directly. Table 2.1 gives the total execution time in seconds, while
table 2.2 gives the total number of OBDD nodes used.* The latter
numbers are not very reliable, since they depend to some extent on
arbitrary choices about when to scavenge unused cells and cache entries.
However, if the available memory limit of 190,000 nodes is exceeded,
it is certain that all of the nodes in use were necessary for the
computation, since all available nodes were scavenged when the memory
limit was reached. The columns give the following information: the
name of the circuit, the method used (direct or iterative), the size of
the equivalence relation, and the time or space needed for each of the
three substitution methods (bottom-up, domain partitioning, and
sequential). The times are for a LISP implementation running on a 1
MIP minicomputer. The final two columns give the results obtained
by Touati et al. and Lin et al. for the same circuit. These times are for
C language implementations running on a DEC 5400 and IBM RS6000
respectively.

It would seem that for the circuits cpb, key and minmax, which have
regular structures with no control registers, there is no clear choice as to

* Actually, function graphs with negated arcs were used for this
calculation [Bil87]. Hence the number of nodes may be slightly smaller
than what would be obtained using OBDDs.

machine    mtd    result     b-u       d-p       seq
                  (nodes)    (nodes)   (nodes)   (nodes)
sbc        iter      361     22184         -     34609
cpb32      iter       95      8202      4225     11295
key        iter      167     13328     24868     11563
minmax10   dir        89     16589     17815     17566
minmax20   dir        59      8538      8492      9190
minmax30   dir        89     11952     11886     10001

Table 2.2: Equivalence calculation space


which substitution algorithm is best. The bottom-up algorithm tends
to provide the best performance with the least memory usage, but there
are a number of exceptions. The machine sbc, which is somewhat more
complex, is a more interesting case. Here bottom-up and sequential
both provide fairly efficient solutions, although the iterative method
was required in both cases to solve the problem. The domain partitioning
approach fails to terminate after 10,000 seconds. In the first stage
of the iterative algorithm, domain partitioning produced over 100,000
subproblems for a final result of approximately 100 nodes. Obviously,
many different subproblems with identical results are being solved. The
difficulty is that there is no easy way to identify equivalent problems.
It is worth mentioning that the limit on the size of the cache for this
method was 3000 entries. With an unbounded cache, the performance
of the algorithm may be much better (a matter of theoretical interest
at best, since an unbounded cache cannot be provided). It should also
be noted that the results for minmax are somewhat anomalous, since
the 10-bit version seems to be substantially more complex than the 20-
and 30-bit cases. This is explained by the fact that the output functions
of these different versions were not the same. In the 20- and 30-bit
versions, the outputs appear to depend only on the "last" register,
and not the "min" and "max" registers. It is also interesting to observe
that for minmax10, not all of the states are distinguishable, that is, the
equivalence relation is not the identity.

Comparing these results to those of Touati et al., it is interesting to
note that the self-equivalence relation can be computed in less time than
the reachable states for sbc and key (taking into account the difference
in machine speeds of roughly a factor of 10, the equivalence method seems
to be about one order of magnitude faster for sbc, and two orders of
magnitude faster for key). Of course, the information obtained by the
two methods is not the same. It seems, however, that in some cases
where the set of reachable states is not obtainable, the equivalence
computation may still provide useful information for logic optimization.
The results of Lin et al. seem to be roughly comparable for the machines
key and cpb32 (again, taking into account the difference in machine
speeds). It is not clear from the Lin et al. article which substitution
method was used, since two were mentioned. The one benchmark for
which the iterative method was required to produce a result was sbc,
but unfortunately Lin et al. do not report a figure for this machine.
Also, because the 20- and 30-bit versions of minmax had modified
output functions, it is not possible to compare figures
for this benchmark. As a result of these ambiguities, it is difficult to
draw conclusions about the effectiveness of the iterative abstraction
method, except to say that in one case (sbc) it was the only method
that successfully computed the equivalence relation.


2.8  Related  research 

The author first experimented with the use of OBDDs to represent
sets of states and transition relations in 1987, building the first
symbolic model checker for CTL. Various heuristic improvements to the
basic technique were developed, including the OBDD algorithm combining
existential quantification and conjunction (cf. section 2.3.4), and
the technique of early quantification for disjunctive transition relations
(cf. section 2.4.2). Extending this work, Burch, Clarke, Long, McMillan,
Dill and Hwang described a symbolic model checking procedure
for the propositional Mu-Calculus, which could be used for a variety
of purposes, including CTL model checking, testing various process
equivalences, testing language containment of ω-automata, and checking
satisfiability of LTL formulas [BCM+90]. In 1989, the author used
the model checking technique to verify the cache consistency protocol
of the Encore Gigamax multiprocessor (see chapter 4). In the process,
a model checking system called SMV was developed, along with an
associated description language (see chapter 3).

In 1989, the idea of using the OBDD representation for verification
of finite state machines appears to have been independently developed
by Coudert, Madre and Berthet [CBM89], who used it in their PRIAM
system for testing equivalence of finite state machines. They represent a
finite state machine by a pair of vector Boolean functions. The function
$\delta(v, w)$ yields the next state vector as a function of the current state
vector $v$ and the input vector $w$. The function $\lambda(v, w)$ yields the output
vector as a function of $v$ and $w$. The equivalence of two state machines is
tested by creating a combined machine in which both machines receive
the same input vector, and the output is a single bit which is true if and
only if the output vectors of the two machines are equal. The reachable
states of this combined machine are computed. If in all reachable states
the output is true, the two machines are equivalent, since no input
sequence can produce differing output sequences from the two machines.
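
The combined-machine test can be illustrated with a small explicit-state sketch. This is not the PRIAM implementation, which works on OBDDs; it only shows the logic of the check. The machine encoding (a triple of initial state, next-state function and output function) and the example machines are assumptions made for illustration.

```python
def machines_equivalent(m1, m2, inputs):
    """Explore the combined machine and check its single output bit.

    Each machine is a hypothetical triple (initial_state, delta, lam), where
    delta(state, inp) gives the next state and lam(state, inp) the output.
    """
    init1, delta1, lam1 = m1
    init2, delta2, lam2 = m2
    reached = {(init1, init2)}
    frontier = [(init1, init2)]
    while frontier:
        s1, s2 = frontier.pop()
        for w in inputs:
            # Output bit of the combined machine: do the two outputs agree?
            if lam1(s1, w) != lam2(s2, w):
                return False
            nxt = (delta1(s1, w), delta2(s2, w))
            if nxt not in reached:
                reached.add(nxt)
                frontier.append(nxt)
    return True


# Illustrative machines: a mod-2 counter, a mod-4 counter that outputs only
# its low bit, and a machine that never leaves state 0.
mod2 = (0, lambda s, w: (s + w) % 2, lambda s, w: s)
mod4 = (0, lambda s, w: (s + w) % 4, lambda s, w: s % 2)
stuck = (0, lambda s, w: 0, lambda s, w: s)
```

Here `mod2` and `mod4` have different numbers of states but produce the same output sequences, so the check succeeds, while `mod2` and `stuck` are distinguished after one step with input 1.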

The set of reached states is computed as the limit of an increasing
series of approximations, starting with the initial state. The set of
states reachable in one step from a set $S$ is computed by a function
called Imag, where
$\mathit{Imag}(\delta, S) = \{ s \mid \exists v, w : v \in S,\ \delta(v, w) = s \}$. Most
of Coudert, Madre and Berthet's efforts are applied to computing the
Imag function without resort to representing the transition relation as
an OBDD, which they claim is generally intractable. Their approach
begins by reducing the problem of computing the image of a set via a
function to computing the range of a function. This is done using an
OBDD operation called Constrain. The Constrain operator takes two
Boolean functions $f$ and $g$, and returns a function $f' = \mathit{Constrain}(f, g)$
with the following property: for all $x'$, $f'(x') = f(x)$, where $x$ is the
nearest Boolean vector to $x'$ (according to a suitable distance metric)
such that $g(x) = 1$. If we let $\delta' = \mathit{Constrain}(\delta, S)$, then the image of
$S$ via $\delta$ is just the range of $\delta'$.
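
The reduction from image to range can be checked by brute force on a small example. The sketch below enumerates Boolean vectors instead of using OBDDs, and it assumes one particular distance metric (earlier variables in the ordering weigh more, so the nearest vector is unique); the text only says "a suitable distance metric". The function `delta` and the set `in_S` are hypothetical.

```python
from itertools import product


def constrain(f, g, n):
    """Return f' with f'(x') = f(x), where x is the vector nearest x' with g(x) true."""
    care = [x for x in product((0, 1), repeat=n) if g(x)]
    if not care:
        raise ValueError("g must be satisfiable")

    def distance(x, y):
        # Assumed metric: a mismatch on an earlier variable costs more.
        return sum((a != b) << (n - 1 - i) for i, (a, b) in enumerate(zip(x, y)))

    def f_prime(x_prime):
        nearest = min(care, key=lambda x: distance(x, x_prime))
        return f(nearest)

    return f_prime


def delta(x):
    """Hypothetical next-state function: a 2-bit counter with enable input w."""
    v0, v1, w = x
    nxt = (2 * v0 + v1 + w) % 4
    return (nxt >> 1, nxt & 1)


def in_S(x):
    """S contains the states 00 and 01; the input bit is unconstrained."""
    return x[0] == 0


all_x = list(product((0, 1), repeat=3))
delta_prime = constrain(delta, in_S, 3)
image_of_S = {delta(x) for x in all_x if in_S(x)}
range_of_delta_prime = {delta_prime(x) for x in all_x}
```

Every vector outside the care set is mapped to the value of `delta` at some vector inside it, so the range of `delta_prime` over all vectors coincides with the image of `S`.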

Coudert and Madre suggest two methods for computing the range of
$\delta'$. The first is called range partitioning. In this approach, we pick the
lowest remaining variable in the ordering (call it $v_i$), and divide
the problem into two subproblems, depending on the output of function
$\delta'_i$. Thus,

$$(\mathit{Range}(\delta'))(v_i \leftarrow 0) = \mathit{Range}(\mathit{Constrain}(\delta', \neg\delta'_i))$$

$$(\mathit{Range}(\delta'))(v_i \leftarrow 1) = \mathit{Range}(\mathit{Constrain}(\delta', \delta'_i))$$

Note that for any function $f$,

$$\mathit{Constrain}(f, f) = 1 \quad \text{and} \quad \mathit{Constrain}(f, \neg f) = 0$$

so each recursion effectively eliminates one component function of $\delta'$.
The recursion terminates when all of the components of $\delta'$ are constants.

The other approach, called domain partitioning, is to divide into
subproblems based on the value of one of the inputs to $\delta'$. Thus,

$$\mathit{Range}(\delta') = \mathit{Range}(\delta'(v_i \leftarrow 0)) \vee \mathit{Range}(\delta'(v_i \leftarrow 1))$$

Again, the recursion terminates when all of the components of $\delta'$ are
constants.

Both of these strategies are special cases of a general strategy where
one chooses a cover, which is a pair of functions $h_1$ and $h_2$ such that
$h_1 \vee h_2 = 1$, and then computes the recursion

$$\mathit{Range}(\delta') = \mathit{Range}(\mathit{Constrain}(\delta', h_1)) \vee \mathit{Range}(\mathit{Constrain}(\delta', h_2))$$

In the case of range partitioning, $h_1 = \delta'_i$ and $h_2 = \neg\delta'_i$. In the case of
domain partitioning, $h_1 = v_i$ and $h_2 = \neg v_i$. It is suggested that other
covers may be useful as well. As with other OBDD techniques, a table
of pairs $(\delta', \mathit{Range}(\delta'))$ is kept to avoid solving the same subproblem
twice. This table is not as effective as in the case of the standard
OBDD operations, however, since the number of possible subproblems
is exponential in the number of state variables. Coudert and Madre
suggest several optimizations for increasing the hit rate in this table.
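
As a rough illustration of domain partitioning with a subproblem table, the following sketch represents each component function as an explicit truth table instead of an OBDD. The representation and the example function are assumptions; only the recursion (split on one input, stop when all components are constant, reuse results from a table) follows the description above.

```python
def range_by_domain_partitioning(components, memo=None):
    """Range of a vector function given as a tuple of truth tables.

    Each truth table is a tuple of 0/1 values of length 2**k, indexed so that
    the first input variable selects the lower or upper half of the table.
    """
    if memo is None:
        memo = {}
    if components in memo:                       # subproblem already solved
        return memo[components]
    if all(len(set(table)) == 1 for table in components):
        # Every component is constant: the range is a single vector.
        result = frozenset({tuple(table[0] for table in components)})
    else:
        half = len(components[0]) // 2
        low = tuple(table[:half] for table in components)    # first input <- 0
        high = tuple(table[half:] for table in components)   # first input <- 1
        result = (range_by_domain_partitioning(low, memo)
                  | range_by_domain_partitioning(high, memo))
    memo[components] = result
    return result


# Hypothetical two-output function of inputs (a, b): (a AND b, a OR b).
and_table = (0, 0, 0, 1)
or_table = (0, 1, 1, 1)
table = {}
example_range = range_by_domain_partitioning((and_table, or_table), table)
```

In this small example the subproblem with constant components `(0, 1)` arises twice, once under `a = 0, b = 1` and once under `a = 1, b = 0`, and the second occurrence is answered from the table.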

A further optimization introduced by Coudert and Madre is to use
an OBDD function called Restrict to reduce the size of the reached state
set before applying the Imag operator. The Restrict operator takes two
functions $f$ and $g$, and produces a function $f' = \mathit{Restrict}(f, g)$ such that
for all $x$, if $g(x) = 1$ then $f'(x) = f(x)$; otherwise $f'(x)$ is arbitrary.
Usually (but not always), the size of $f'$ is less than the size of $f$. We
note that if $R_i$ is the set of states reachable after $i$ steps of the machine,
then

$$R_{i+1} = R_i \vee \mathit{Imag}(\delta, R_i) = R_i \vee \mathit{Imag}(\delta, \mathit{Restrict}(R_i, \neg R_{i-1}))$$

As a result, the size of the arguments of Imag can sometimes be reduced
using Restrict.
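
The identity above holds because the image of $R_{i-1}$ is already contained in $R_i$, so any set lying between the frontier $R_i \wedge \neg R_{i-1}$ and $R_i$ yields the same union. The sketch below demonstrates this with explicit sets; it takes the frontier itself as the argument, which is only one of the choices Restrict is allowed to make. The example machine is hypothetical.

```python
def image(delta, inputs, states):
    """States reachable in one step from the given set."""
    return {delta(v, w) for v in states for w in inputs}


def reachable(delta, inputs, init, use_frontier=True):
    previous, reached = set(), {init}
    while reached != previous:
        # Any set between (reached - previous) and reached gives the same
        # result; the frontier is the smallest such set.
        arg = reached - previous if use_frontier else reached
        previous, reached = reached, reached | image(delta, inputs, arg)
    return reached


def delta(v, w):
    """Hypothetical machine: add 2 modulo 6 when the input is 1."""
    return (v + 2 * w) % 6


with_frontier = reachable(delta, (0, 1), 0)
without_frontier = reachable(delta, (0, 1), 0, use_frontier=False)
```

With OBDDs the smallest set is not necessarily the one with the smallest representation, which is why Restrict is used to pick a compact function within the allowed interval.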

Coudert and Madre report experimental results for computation
of the set of reachable states for a variety of small sequential circuits
(mostly ISCAS'89 sequential benchmark circuits). Computing the set of
reached states can be useful for generating test patterns or "don't care"
conditions for logic optimization [TSL+90]. Unfortunately, Coudert and
Madre do not use their techniques to actually test the equivalence of
two state machines, so it is unknown whether the technique is useful
for this purpose. They have not studied the asymptotic performance of
their techniques for classes of circuits, so it is not possible to determine
whether their optimizations yield asymptotic improvements.

A variant on the symbolic model checking technique for CTL was
proposed by Bose and Fisher [BF89b]. Their technique, which is limited
to deterministic finite state machines, also represents the transition
relation of the machine by a vector of Boolean functions $\delta$,
and uses Bryant's Compose operation to compute $\mathrm{EX}\,p = p(v_i \leftarrow \delta_i)$.
They do not report experimental results using this technique for
practical circuits. A similar technique was proposed by Coudert and
Madre [CMB91].

Other researchers have proposed techniques to avoid constructing
the transition relation. For example, Burch, Clarke and Long use early
quantification (cf. section 2.4.2) for both disjunctive and conjunctive
transition relations [BCL91b, BCL91a]. They use the term "partitioned
transition relations" for this. The technique is somewhat limited in the
case of conjunctive transition relations, because existential quantification
only distributes over conjunction in the special case when one of
the conjuncts does not depend on the variable being quantified.
Nevertheless, there are cases where the support of the component relations
is sufficiently disjoint to make this technique effective.

The basic technique is the following: assume we wish to compute
$\exists v.\ \bigwedge_i f_i$, where $v = (v_1, \ldots, v_n)$ is a vector of variables and
$f = (f_1, \ldots, f_m)$ is a vector of Boolean functions. Since conjunction
is associative and commutative, we can combine these functions in any
order we choose. In addition, if at any time there is a variable occurring
in only one function, we can quantify that variable out, since
$\exists w.\ (p \wedge q)$ is equivalent to $(\exists w.\ p) \wedge q$ when $q$ does not depend
on $w$. Since quantification
tends to reduce OBDD size by reducing the number of variables,
the strategy is to combine the functions in such an order that variables
can be quantified out as soon as possible.
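
A minimal sketch of this strategy follows, using explicit sets of satisfying assignments in place of OBDDs. The helper names (`relation`, `conjoin`, `exists`, `early_quantify`) and the example functions are hypothetical; the point is only that quantifying a variable as soon as no remaining function mentions it gives the same result as quantifying everything at the end.

```python
from itertools import product


def relation(variables, predicate):
    """Satisfying assignments of a predicate over named Boolean variables."""
    rows = set()
    for values in product((0, 1), repeat=len(variables)):
        env = dict(zip(variables, values))
        if predicate(env):
            rows.add(frozenset(env.items()))
    return frozenset(variables), rows


def conjoin(r1, r2):
    (vars1, rows1), (vars2, rows2) = r1, r2
    shared = vars1 & vars2
    rows = {a | b for a in rows1 for b in rows2
            if all(dict(a)[v] == dict(b)[v] for v in shared)}
    return vars1 | vars2, rows


def exists(quantified, r):
    variables, rows = r
    kept = variables - set(quantified)
    return kept, {frozenset((k, v) for k, v in row if k in kept) for row in rows}


def early_quantify(quantified, relations):
    """Conjoin in the given order, quantifying each variable once it is dead."""
    result = (frozenset(), {frozenset()})         # the constant "true"
    for i, r in enumerate(relations):
        result = conjoin(result, r)
        still_needed = set().union(*[vs for vs, _ in relations[i + 1:]])
        dead = [v for v in quantified
                if v in result[0] and v not in still_needed]
        result = exists(dead, result)
    return result


# Hypothetical chain: a = x, b = not a, y = b; quantify out a and b.
f1 = relation(("x", "a"), lambda e: e["a"] == e["x"])
f2 = relation(("a", "b"), lambda e: e["b"] == 1 - e["a"])
f3 = relation(("b", "y"), lambda e: e["y"] == e["b"])
early = early_quantify(["a", "b"], [f1, f2, f3])
late = exists(["a", "b"], conjoin(conjoin(f1, f2), f3))
```

In the early version no intermediate result ever mentions more than three variables, whereas the late version builds a relation over all four before projecting.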

Burch, Clarke and Long use a fixed order determined by the user
for combining the functions. They show that this is quite effective
for pipelined data path circuits, and an asynchronous stack circuit,
improving the asymptotic performance as the circuit size increases. For
the DME circuit, the asymptotic performance of this method was not
as good as a method using a disjunctive transition relation, but it can
be more efficient for small rings. It was found most efficient to group
the components of the transition relation and combine each group in
advance, thus avoiding some computation at each step.

For disjunctive transition relations (interleaving models), Burch,
Clarke and Long introduce a modified search order that tends to reduce
the representation of the reached state set. In a breadth first search,
the representation of this set is complicated by the fact that after n
steps, the number of steps taken by each process is constrained to sum
to n. This produces an artificial correlation between the states of
otherwise independent processes (cf. section 2.1.2). To counter this, one
can modify the search order, searching first all of the states reachable
by transitions of one subset of the system processes, then the next,
and repeating this process until a fixed point is reached. This technique,
called "modified breadth first search", was effective in reducing
the OBDDs representing the reached state sets for an asynchronous
stack circuit, but was found not to be as effective as the "conjunctive
partitioning" method. For the DME circuit, the modified breadth first
search method was faster up to about 16 cells, but had slower asymptotic
performance. The grouping of processes into subsets was manual.
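
The modified search order can be sketched with explicit state sets as follows. This is only an illustration of the order in which transitions are applied; the benefit reported above concerns OBDD sizes, which an explicit-state sketch cannot show. The interface (`process_groups` as lists of step functions) and the example processes are assumptions.

```python
def modified_bfs(init, process_groups):
    """Reachable states of an interleaving model, searched one group at a time.

    process_groups is a list of lists of step functions; each step function
    maps a global state to the set of states reachable by one move of one
    process (a hypothetical interface).
    """
    reached = {init}
    changed = True
    while changed:
        changed = False
        for group in process_groups:
            # Saturate under this group's transitions before moving on.
            frontier = set(reached)
            while frontier:
                new = {t for s in frontier for step in group
                       for t in step(s)} - reached
                reached |= new
                frontier = new
                changed = changed or bool(new)
    return reached


def counter(index, limit=2):
    """Hypothetical process: increments one component up to a limit."""
    def step(state):
        if state[index] >= limit:
            return set()
        nxt = list(state)
        nxt[index] += 1
        return {tuple(nxt)}
    return step


reached = modified_bfs((0, 0), [[counter(0)], [counter(1)]])
```

After the first group is saturated the reached set is a full "row" of states, and after the second it is the full product, so the intermediate sets never encode the constraint that the step counts sum to a fixed number.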

Another OBDD based technique for computing the reachable states
of a machine was introduced by Touati et al. [TSL+90]. They also
use a conjunction of component relations to represent the transition
relation, along with early quantification. However, they combine this
technique with the Constrain operation of Coudert et al. This reduces
the problem of computing the image of a set via a relation to that of
computing the codomain of a relation. A series of approximations $A_i$
to the reachable states is computed, such that

$$A_{i+1} = A_i \vee \lambda y.\ \exists x.\ \Bigl(\bigwedge_j \mathit{Constrain}(R_j, A_i)\Bigr)(x, y)$$

where $R$ is a vector of component relations, each relation determining
the new state of one state variable. Touati et al. find this technique
to be superior to using the transition relation directly and to using the
Imag operation of Coudert et al. for computing the reachable states of
the benchmark circuits minmax and sbc, somewhat slower for key, and
roughly the same for cpb.32.4. It would be interesting to know, for the
cases where an improvement was obtained, how much was due to the use
of Constrain and how much to the use of early quantification. Touati
et al. have also suggested partitioning complex next-state functions
into the composition of a sequence of smaller functions. This could be
useful for circuits containing multipliers, or other functions which have
no compact OBDD representation.

Touati, Brayton and Kurshan report a technique for testing language
containment of ω-automata using OBDDs [TBK91]. They use
the L-automaton model of Kurshan [Kur86], and an algorithm similar
to the one described in section 2.6.1 using the transitive closure of the
transition relation. No experimental results using this technique are
available.

Another way that one can test equivalence of two finite state machines
is by computing the equivalence relation on states, as described
in section 2.7. Lin et al. also describe OBDD based algorithms for
computing this relation [LTN90]. A comparison of the methods can be
found in section 2.7. Lin et al. describe how the equivalence relation
can be used for computing "don't care" conditions for logic optimization.
In a later paper [LN91], Lin shows how this relation (represented
as an OBDD) can be used for state minimization, using an operator
which takes an equivalence relation and returns a relation which maps
every state onto the least element of its equivalence class.
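
The effect of such an operator can be shown on an explicit state set. The sketch below is not the OBDD operator of [LN91]; it only computes the same mapping by enumeration, assuming states are comparable and the equivalence is given as a predicate. The example equivalence is hypothetical.

```python
def least_representative(states, equivalent):
    """Map every state to the least state of its equivalence class."""
    return {s: min(t for t in states if equivalent(s, t)) for s in states}


# Hypothetical example: six states, equivalent when congruent modulo 3.
mapping = least_representative(range(6), lambda s, t: s % 3 == t % 3)
minimized_states = set(mapping.values())
```

The image of the mapping contains one representative per class, which gives the state set of the minimized machine.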

Bryant and Seger have taken an approach to formal verification
using OBDDs based on symbolic simulation [Bry88, BBS90, BS90]. The
symbolic simulator is similar to an ordinary logic simulator, except that
the inputs are symbolic values (variables) rather than numeric values,
and the outputs are given as symbolic functions in terms of these variables.
These functions are represented by OBDDs. The simulation
method gains a great deal in efficiency by using an abstract interpretation
of the circuit model. This abstraction uses a lattice consisting
of the three values 0, 1 and X, where X is the least upper bound of 0
and 1. The circuit operations such as AND and OR are abstracted in
such a way as to be monotonic with respect to this lattice. Therefore,
the output of the abstract simulation is always an upper bound on the
output of the concrete simulation. In many cases, a large number of
the inputs and initial values of state variables can be replaced by X
without sacrificing the particular circuit property being proved. The
art in this technique is to decompose the specification in such a way
that each part can be verified using only a small number of symbolic
values and X everywhere else. The simulation technique is limited to
a logic with only next-time operators. These formulas can be verified
using symbolic simulations of finite execution sequences. This rules out
proving properties such as liveness, fairness or deadlock freedom, but
allows safety properties to be proved using invariants.
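
The three-valued abstraction can be illustrated without OBDDs. The sketch below defines AND, OR and NOT over the values 0, 1 and X and applies them to a hypothetical multiplexer; it covers only the lattice operations, not the symbolic part of the simulator.

```python
X = "X"   # the "unknown" value, an upper bound of both 0 and 1


def t_and(a, b):
    if a == 0 or b == 0:
        return 0
    if a == 1 and b == 1:
        return 1
    return X


def t_or(a, b):
    if a == 1 or b == 1:
        return 1
    if a == 0 and b == 0:
        return 0
    return X


def t_not(a):
    return X if a == X else 1 - a


def mux(sel, a, b):
    """Hypothetical circuit: output a when sel is 1, b otherwise."""
    return t_or(t_and(sel, a), t_and(t_not(sel), b))


def refines(concrete, abstract):
    """True when the abstract value is an upper bound of the concrete one."""
    return abstract == X or abstract == concrete
```

For example, `mux(1, 0, X)` evaluates to 0, so the unselected input can be left at X without losing the result. On the other hand, `mux(X, 1, 1)` evaluates to X even though every concrete value of the select input gives 1, which shows that the abstract result is an upper bound and not always exact.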

Bose and Fisher have demonstrated a technique for using representation
functions to verify sequential circuits using OBDDs [BF89a].
A representation function maps each state of the implementation to a
state of the specification (which is also a circuit). Symbolic simulation
techniques can be used to show a kind of single step equivalence between
the implementation and specification vis-à-vis this relation. As
in the method of Bryant and Seger, this proof can be decomposed into
parts in such a way that each part requires only a small number of
symbolic variables, with the remaining circuit nodes initialized to X.
Typically, an invariant is also required, since the single step equivalence
only holds over the reachable state space of the implementation. This
technique is also limited in that it cannot prove liveness or deadlock
properties.

Long and Grumberg have introduced an abstraction technique using
OBDDs which is more general than simply introducing X values
[CGL92]. Their technique uses an OBDD to express the relation
between the abstract and concrete domains. The abstract transition
relation is automatically derived using OBDD techniques from the
concrete transition relation. This can be done in a compositional way to
reduce the number of symbolic variables that are required. A variety
of abstractions have been put to use in this way. For example, a binary
number can be represented by its remainders modulo a set of relatively
prime numbers. This has allowed the use of the Chinese remainder
theorem to prove the correctness of a multiplier circuit. In another case,
a single bit was used to represent whether a given binary number in a
circuit is equal to a given symbolic binary value. In this way the entire
function of the arithmetic unit was abstracted away, allowing a data
pipeline circuit with 64 64-bit registers to be verified. This abstraction
technique is quite general, and is closely related to more classical
abstraction techniques [Kur87]. The difference is that function graph
methods are used to actually compute the abstract transition relation,
rather than giving this relation a priori.
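
One common way to derive an abstract transition relation from a concrete one is existential abstraction: two abstract states are related whenever some pair of concrete states mapping to them is related. The sketch below assumes this construction and an abstraction given as a function `h`; both are assumptions for illustration, since the text does not spell out the construction, and the example (a counter abstracted by its remainder modulo 3) is hypothetical.

```python
def abstract_relation(concrete_transitions, h):
    """Existential abstraction: (a, a2) is related when some concrete
    transition (s, t) has h(s) == a and h(t) == a2."""
    return {(h(s), h(t)) for s, t in concrete_transitions}


# Hypothetical concrete system: a counter modulo 15.
concrete = {(s, (s + 1) % 15) for s in range(15)}
mod3 = abstract_relation(concrete, lambda s: s % 3)
# Abstracting by a pair of remainders modulo relatively prime numbers.
mod3_and_5 = abstract_relation(concrete, lambda s: (s % 3, s % 5))
```

Because 3 and 5 are relatively prime and their product is 15, the pair of remainders identifies each concrete state, so the second abstraction has as many transitions as the concrete counter, while the first collapses it to a three-state cycle.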




Chapter  3 

The  SMV  system 


The SMV system is a tool for checking finite state systems against
specifications in the temporal logic CTL. The input language of SMV
is designed to allow the description of finite state systems that range
from completely synchronous to completely asynchronous, and from the
detailed to the abstract. One can readily specify a system as a synchronous
Mealy machine, or as an asynchronous network of abstract,
nondeterministic processes. The language provides for modular hierarchical
descriptions, and for the definition of reusable components. Since
it is intended to describe finite state machines, the only basic data types
in the language are finite scalar types. Static, structured data types
can also be constructed. The logic CTL allows a rich class of temporal
properties, including safety, liveness, fairness and deadlock freedom, to
be specified in a concise syntax. SMV uses the OBDD-based symbolic
model checking algorithm to efficiently determine whether specifications
expressed in CTL are satisfied.

The primary purpose of the SMV input language is to provide a
symbolic description of the transition relation of a finite Kripke structure.
Any propositional formula can be used to describe this relation.
This provides a great deal of flexibility, and at the same time a certain
danger of inconsistency. For example, the presence of a logical
contradiction can result in a deadlock, that is, a state or states with no
successor. This can make some specifications vacuously true, and makes
the description unimplementable. While the model checking process
can be used to check for deadlocks, it is best to avoid the problem
when possible by using a restricted description style. The SMV system
supports this by providing a parallel-assignment syntax. The semantics
of assignment in SMV is similar to that of single assignment data flow
languages. A program can be viewed as a system of simultaneous equations,
whose solutions determine the next state. By checking programs
for multiple assignments to the same variable, circular dependencies,
and type errors, the compiler ensures that a program using only the
assignment mechanism is implementable. Consequently, this fragment
of the language can be viewed as a hardware description language, or
a programming language. The SMV system is by no means the last
word on symbolic model checking techniques, nor is it intended to be a
complete hardware description language. It is simply an experimental
tool for exploring the possible applications of symbolic model checking
to hardware verification.


3.1  An  informal  introduction 

Before delving into the syntax and semantics of the language, let us
first consider a few simple examples that illustrate the basic concepts.
Consider the following short program in the language.


MODULE main
VAR
    request : boolean;
    state : {ready, busy};
ASSIGN
    init(state) := ready;
    next(state) := case
            state = ready & request : busy;
            1 : {ready, busy};
        esac;
SPEC
    AG(request -> AF state = busy)


The input file describes both the model and the specification. The
model is a Kripke structure, whose state is defined by a collection of
state variables, which may be of Boolean or scalar type. The variable
request is declared to be a Boolean in the above program, while the
variable state is a scalar, which can take on the symbolic values ready
or busy. The value of a scalar variable is encoded by the compiler
using a collection of Boolean variables, so that the transition relation
may be represented by an OBDD. This encoding is invisible to the user,
however.

The transition relation of the Kripke structure, and its initial state
(or states), are determined by a collection of parallel assignments (a
system of simultaneous equations), which are introduced by the keyword
ASSIGN. In the above program, the initial value of the variable
state is set to ready. The next value of state is determined by the
current state of the system by assigning it the value of the expression

    case
        state = ready & request : busy;
        1 : {ready, busy};
    esac;

The value of a case expression is determined by the first expression
on the right hand side of a (:) such that the condition on the left hand
side is true. Thus, if state = ready & request is true, then the result
of the expression is busy; otherwise, it is the set {ready, busy}. When
a set is assigned to a variable, the result is a non-deterministic choice
among the values in the set. Thus, if the value of state is not ready,
or request is false (in the current state), the value of state in the next
state can be either ready or busy. Non-deterministic choices are useful
for describing systems which are not yet fully implemented (i.e., where
some design choices are left to the implementor), or abstract models of
complex protocols, where the value of some state variables cannot be
completely determined.

Notice that the variable request is not assigned in this program.
This leaves the SMV system free to choose any value for this variable,
giving it the characteristics of an unconstrained input to the system.

The specification of the system appears as a formula in CTL under
the keyword SPEC. The SMV model checker verifies that all possible
initial states satisfy the specification. In this case, the specification is
that invariantly if request is true, then inevitably the value of state
is busy.
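
To make the meaning of this specification concrete, the following sketch enumerates the four states of the example program and evaluates AG(request -> AF state = busy) with explicit fixed point iterations. It is an illustration only: SMV performs the same fixed point computations on OBDDs, and the Python names used here (`STATES`, `successors`, `AF`, `AG`) are hypothetical.

```python
from itertools import product

# A state is a pair (request, state).
STATES = list(product((False, True), ("ready", "busy")))


def successors(s):
    request, state = s
    next_states = ["busy"] if state == "ready" and request else ["ready", "busy"]
    # request is unassigned, so it may take either value in the next state.
    return {(r, n) for r in (False, True) for n in next_states}


def AF(target):
    """Least fixed point: states from which every path reaches target."""
    result = set(target)
    while True:
        new = result | {s for s in STATES if successors(s) <= result}
        if new == result:
            return result
        result = new


def AG(target):
    """Greatest fixed point: states from which target holds along every path."""
    result = set(target)
    while True:
        new = {s for s in result if successors(s) <= result}
        if new == result:
            return result
        result = new


busy = {s for s in STATES if s[1] == "busy"}
af_busy = AF(busy)
implication = {s for s in STATES if not s[0] or s in af_busy}
spec = AG(implication)
initial = {s for s in STATES if s[1] == "ready"}   # init(state) := ready
```

The specification holds when `initial` is a subset of `spec`. Note that the state with request false and state ready does not satisfy AF state = busy, since the machine may remain ready forever, but the implication is still true there because request is false.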
The following program illustrates the definition of reusable modules
and expressions. It is a model of a 3 bit binary counter circuit. Notice
that the module name "main" has special meaning in SMV, in the same
way that it does in the C programming language. The order of module
definitions in the input file is inconsequential.

MODULE main
VAR
    bit0 : counter_cell(1);
    bit1 : counter_cell(bit0.carry_out);
    bit2 : counter_cell(bit1.carry_out);
SPEC
    AG AF bit2.carry_out

MODULE counter_cell(carry_in)
VAR
    value : boolean;
ASSIGN
    init(value) := 0;
    next(value) := value + carry_in mod 2;
DEFINE
    carry_out := value & carry_in;

In this example, we see that a variable can also be an instance of a
user defined module. The module in this case is counter_cell, which
is instantiated three times, with the names bit0, bit1 and bit2. The
counter cell module has one formal parameter carry_in. In the instance
bit0, this formal parameter is given the actual value 1. In the instance
bit1, carry_in is given the value of the expression bit0.carry_out.
This expression is evaluated in the context of the main module. However,
an expression of the form a.b denotes component b of module a,
just as if the module a were a data structure in a standard programming
language. Hence, the carry_in of module bit1 is the carry_out
of module bit0. The keyword DEFINE is used to assign the expression
value & carry_in to the symbol carry_out. Definitions of this
type are useful for describing Mealy machines. They are analogous to
macro definitions, but notice that a symbol can be referenced before it
is defined.
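The behaviour of the counter can be traced by hand with a small explicit-state simulation. The sketch below is written in Python rather than SMV, and it is only an illustration under two assumptions: the next-value expression is read as the intended (value + carry_in) mod 2, and the helper names step and carry_out_bit2 are invented for this example.

```python
def step(values):
    """One synchronous step: all three cells update from the current values."""
    carry_in = 1                          # bit0 is instantiated with carry_in = 1
    nxt = []
    for v in values:
        nxt.append((v + carry_in) % 2)    # assumed reading of the next(value) rule
        carry_in = v & carry_in           # carry_out of this cell feeds the next cell
    return tuple(nxt)

def carry_out_bit2(values):
    """carry_out of the last cell, computed from the current values."""
    carry = 1
    for v in values:
        carry = v & carry
    return carry

state = (0, 0, 0)                         # init(value) := 0 in every cell
trace = []
for _ in range(16):
    trace.append((state, carry_out_bit2(state)))
    state = step(state)
```

In this trace the cells count in binary with bit0 as the least significant bit, and bit2.carry_out is 1 once every eight steps, which is the recurring behaviour that the specification AG AF bit2.carry_out asks for.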

The effect of the DEFINE statement could have been obtained by
declaring a variable and assigning its value, as follows:

VAR
  carry_out : boolean;
ASSIGN
  carry_out := value & carry_in;

Notice that in this case, the current value of the variable is assigned,
rather than the next value. Defined symbols are sometimes preferable to
variables, however, since they don't require introducing a new variable
into the OBDD representation of the system. The weakness of defined
symbols is that they cannot be given values non-deterministically.
Another difference between defined symbols and variables is that while
variables are statically typed, definitions are not. This may be an
advantage or a disadvantage, depending on your point of view.

In a parallel-assignment language, the question arises: "What happens
if a given variable is assigned twice in parallel?" More seriously:
"What happens in the case of an absurdity, like a := a + 1; (as opposed
to the sensible next(a) := a + 1;)?" In the case of SMV, the
compiler detects both multiple assignments and circular dependencies,
and treats these as semantic errors, even in the case where the
corresponding system of equations has a unique solution. Another way of
putting this is that there must be a total order in which the assignments
can be executed which respects all of the data dependencies. The same
logic applies to defined symbols. As a result, all legal SMV programs
are realizable.

By default, all of the assignment statements in an SMV program
are executed in parallel and simultaneously. It is possible, however, to
define a collection of parallel processes, whose actions are interleaved
arbitrarily in the execution sequence of the program. This is useful
for describing communication protocols, asynchronous circuits, or other
systems whose actions are not synchronized (including synchronous
circuits with more than one clock). This technique is illustrated by the
following program, which represents a ring of three inverting gates.

MODULE main
VAR
  gate1 : process inverter(gate3.output);
  gate2 : process inverter(gate1.output);
  gate3 : process inverter(gate2.output);
SPEC
  (AG AF gate1.output) & (AG AF !gate1.output)

MODULE inverter(input)
VAR
  output : boolean;
ASSIGN
  init(output) := 0;
  next(output) := !input;

A process is an instance of a module which is introduced by the keyword
process. The program executes a step by non-deterministically
choosing a process, then executing all of the assignment statements in
that process in parallel. It is implicit that if a given variable is not
assigned by the process, then its value remains unchanged. Because the
choice of the next process to execute is non-deterministic, this program
models the ring of inverters independently of the speed of the gates.
The specification of this program states that the output of gate1
oscillates (i.e., that its value is infinitely often zero, and infinitely
often 1). In fact, this specification is false, since the system is not
forced to execute every process infinitely often, hence the output of a
given gate may remain constant, regardless of changes of its input.
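The effect of the scheduling choice can be seen in a short simulation. The following Python sketch is an illustration only, not SMV's algorithm: the function names run_process and gate1_outputs are invented here, gates are numbered from 0, and a finite run can only suggest what happens along an infinite path.

```python
from itertools import cycle, islice

def run_process(state, i):
    """Execute inverter i (0-based); outputs of the other gates are unchanged."""
    nxt = list(state)
    nxt[i] = 1 - state[(i - 1) % 3]      # gate i reads the previous gate's output
    return tuple(nxt)

def gate1_outputs(schedule, steps=12, state=(0, 0, 0)):
    """Output of gate1 (index 0) along a run driven by the given schedule."""
    seen = [state[0]]
    for i in islice(schedule, steps):
        state = run_process(state, i)
        seen.append(state[0])
    return seen

only_gate2 = gate1_outputs(cycle([1]))         # gate1 and gate3 never execute
round_robin = gate1_outputs(cycle([0, 1, 2]))  # every gate executes in turn
```

When only gate2 is ever chosen, the output of gate1 stays at 0 for the whole run, which is the kind of path that falsifies the specification. Under the round-robin schedule the output of gate1 takes both values repeatedly.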

In order to force a given process to execute infinitely often, we can
use a fairness constraint. A fairness constraint restricts the attention
of the model checker to those execution paths along which a given CTL
formula is true infinitely often. Each process has a special variable
called running which is true if and only if that process is currently
executing. By adding the declaration

FAIRNESS 

running 

to the module inverter, we can effectively force every instance of
inverter to execute infinitely often, thus making the specification true.

One advantage of using interleaving processes to describe a system
is that it allows a particularly efficient OBDD representation of
the transition relation. We observe that the set of states reachable by
one step of the program is the union of the sets of states reachable by
each individual process. Hence, rather than constructing the transition
relation of the entire system, we can use the transition relations of
the individual processes separately and then combine the results (cf.
section 2.4.2). This can yield a substantial savings in space in
representing the transition relation.

The alternative to using processes to model an asynchronous circuit
would be to have all gates execute simultaneously, but allow each gate
the non-deterministic choice of evaluating its output, or keeping the
same output value. Such a model of the inverter ring would look like
the following:

MODULE main
VAR
  gate1 : inverter(gate3.output);
  gate2 : inverter(gate1.output);
  gate3 : inverter(gate2.output);
SPEC
  (AG AF gate1.output) & (AG AF !gate1.output)

MODULE inverter(input)
VAR
  output : boolean;
ASSIGN
  init(output) := 0;
  next(output) := !input union output;

The union operator allows us to express a nondeterministic choice
between two expressions. Thus, the next output of each gate can be
either its current output, or the negation of its current input - each
gate can choose non-deterministically whether to delay or not. As a
result, the number of possible transitions from a given state can be
as high as 2^n, where n is the number of gates. This sometimes (but
not always) makes it more expensive to represent the transition
relation. The relative advantages of interleaving and simultaneous models
of asynchronous systems are discussed in section 2.1.2.
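The 2^n bound can be checked by enumerating successors explicitly. The Python sketch below is an illustration only: the function name successors is invented for this example, and it models nothing beyond the rule next(output) := !input union output for a ring of gates.

```python
from itertools import product

def successors(state):
    """All next states of the simultaneous ring model."""
    n = len(state)
    choices = []
    for i in range(n):
        inverted = 1 - state[(i - 1) % n]      # !input: negate the previous gate's output
        choices.append({inverted, state[i]})   # union with the current output
    return set(product(*choices))              # every gate chooses independently

from_initial = successors((0, 0, 0))
from_settled = successors((1, 0, 1))
```

From the initial state every gate has two distinct choices, so there are 2^3 = 8 successors. From the state (1, 0, 1) only gate1 has a real choice, because for the other two gates the inverted input equals the current output.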

As a second example of processes, the following program uses a
variable semaphore to implement mutual exclusion between two
asynchronous processes. Each process has four states: idle, entering,
critical and exiting. The entering state indicates that the process
wants to enter its critical region. If the variable semaphore is zero, it
goes to the critical state, and sets semaphore to one. On exiting its
critical region, the process sets semaphore to zero again.

MODULE main
VAR
  semaphore : boolean;
  proc1 : process user(semaphore);
  proc2 : process user(semaphore);
ASSIGN
  init(semaphore) := 0;
SPEC
  AG !(proc1.state = critical & proc2.state = critical)

MODULE user(semaphore)
VAR
  state : {idle,entering,critical,exiting};
ASSIGN
  init(state) := idle;
  next(state) :=
    case
      state = idle : {idle,entering};
      state = entering & !semaphore : critical;
      state = critical : {critical,exiting};
      state = exiting : idle;
      1 : state;
    esac;
  next(semaphore) :=
    case
      state = entering : 1;
      state = exiting : 0;
      1 : semaphore;
    esac;
FAIRNESS
  running

If any specification in the program is false, the SMV model checker
attempts to produce a counterexample, proving that the specification is
false. This is not always possible, since formulas preceded by existential
path quantifiers cannot be proved false by showing a single execution
path. Similarly, subformulas preceded by universal path quantifiers
cannot be proved true by showing a single execution path. In addition,
some formulas require infinite execution paths as counterexamples. In
this case, the model checker outputs a looping path up to and including
the first repetition of a state.

In  the  case  of  the  semaphore  program,  suppose  that  the  specification 
were  changed  to 

AG (proc1.state = entering -> AF proc1.state = critical)

In other words, we specify that if proc1 wants to enter its critical
region, it eventually does. The output of the model checker in this
case is shown in the accompanying figure. The counterexample shows a
path with proc1 going to the entering state, followed by a loop in which
proc2 repeatedly enters its critical region and then returns to its idle
state, with proc1 executing only while proc2 is in its critical region.
This path shows that the specification is false, since proc1 never enters
its critical region. Note that in the printout of an execution sequence,
only the values of variables that change are printed, to make it easier
to follow the action in systems with a large number of variables.
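Both observations about the semaphore program can be reproduced with an explicit-state search. The Python sketch below is an illustration under stated assumptions, not the symbolic algorithm used by SMV: states are triples (semaphore, proc1 state, proc2 state), the names user_steps, successors and reachable are invented here, and the starvation loop is written out by hand from the description above.

```python
def user_steps(sem, st):
    """All (next semaphore, next state) pairs for one executing user process."""
    if st == "idle":
        nxt_states = ["idle", "entering"]
    elif st == "entering" and not sem:
        nxt_states = ["critical"]
    elif st == "critical":
        nxt_states = ["critical", "exiting"]
    elif st == "exiting":
        nxt_states = ["idle"]
    else:
        nxt_states = [st]                 # default branch: state is unchanged
    if st == "entering":
        nxt_sem = 1
    elif st == "exiting":
        nxt_sem = 0
    else:
        nxt_sem = sem
    return [(nxt_sem, s) for s in nxt_states]

def successors(state):
    """Interleaving step: exactly one of the two processes executes."""
    sem, s1, s2 = state
    out = set()
    for nsem, n1 in user_steps(sem, s1):
        out.add((nsem, n1, s2))
    for nsem, n2 in user_steps(sem, s2):
        out.add((nsem, s1, n2))
    return out

def reachable(init):
    seen, frontier = {init}, [init]
    while frontier:
        for nxt in successors(frontier.pop()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

states = reachable((0, "idle", "idle"))
mutex_holds = all(not (s1 == "critical" and s2 == "critical")
                  for _, s1, s2 in states)

# Hand-written loop in which proc1 waits in "entering" forever.
loop = [(0, "entering", "idle"), (0, "entering", "entering"),
        (1, "entering", "critical"), (1, "entering", "critical"),
        (1, "entering", "exiting"), (0, "entering", "idle")]
loop_is_path = all(b in successors(a) for a, b in zip(loop, loop[1:]))
```

In this sketch mutex_holds is true over all reachable states, while loop_is_path confirms that the starvation loop is a legal execution: proc1 executes once (without changing anything) while proc2 is critical, so both processes run on every pass around the loop.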

Although the parallel assignment mechanism should be suitable to
most purposes, it is possible in SMV to specify the transition relation
directly as a propositional formula in terms of the current and next
values of the state variables. Any current/next state pair is in the
transition relation if and only if the value of the formula is one.
Similarly, it is possible to give the set of initial states as a formula
in terms of only the current state variables. These two functions are
accomplished by the TRANS and INIT statements respectively. As an
example, here is a description of the three inverter ring using only
TRANS and INIT:

MODULE main
VAR
  gate1 : inverter(gate3.output);
  gate2 : inverter(gate1.output);
  gate3 : inverter(gate2.output);
SPEC
  (AG AF gate1.output) & (AG AF !gate1.output)

specification is false

AG (proc1.state = entering -> AF proc1.s... is false:
  .semaphore = 0
  .proc1.state = idle
  .proc2.state = idle
next state:
  [executing process .proc1]
next state:
  .proc1.state = entering
AF proc1.state = critical is false:
  [executing process .proc2]
next state:
  [executing process .proc2]
  .proc2.state = entering
next state:
  [executing process .proc1]
  .semaphore = 1
  .proc2.state = critical
next state:
  [executing process .proc2]
next state:
  [executing process .proc2]
  .proc2.state = exiting
next state:
  .semaphore = 0
  .proc2.state = idle

Figure: Model checker output for semaphore example

MODULE inverter(input)
VAR
  output : boolean;
INIT
  output = 0
TRANS
  next(output) = !input | next(output) = output

According to the TRANS declaration, for each inverter, the next value
of the output is equal either to the negation of the input, or to the
current value of the output. Thus, in effect, each gate can choose
non-deterministically whether or not to delay. The use of TRANS and INIT
is not recommended, since logical absurdities in these declarations can
lead to unimplementable descriptions. For example, one could declare
the logical constant 0 (false) to represent the transition relation,
resulting in a system with no transitions at all. However, the flexibility
of these mechanisms may be useful for those writing translators from
other languages to SMV.

To summarize, the SMV language is designed to be flexible in terms
of the styles of models it can describe. It is possible to fairly concisely
describe synchronous or asynchronous systems, to describe detailed
deterministic models or abstract nondeterministic models, and to exploit
the modular structure of a system to make the description more concise.
It is also possible to write logical absurdities if one desires to, and
also sometimes if one does not desire to, using the TRANS and INIT
declarations. By using only the parallel assignment mechanism, however,
this problem can be avoided. The language is designed to exploit the
capabilities of the symbolic model checking technique. As a result the
available data types are all static and finite. No attempt has been made
to support a particular model of communication between concurrent
processes (e.g., synchronous or asynchronous message passing). In
addition, there is no explicit support for some features of communicating
process models such as sequential composition. Since the full generality
of the symbolic model checking technique is available through the SMV
language, it is possible that translators from various languages, process
models, and intermediate formats could be created. In particular,
existing silicon compilers could be used to translate high level languages
with rich feature sets into a low level form (such as a Mealy machine)
that could be readily translated into the SMV language.


3.2  The  input  language 

This section describes the various constructs of the SMV input
language, and their syntax.


3.2.1  Lexical  conventions 

An atom in the syntax described below may be any sequence of characters
in the set {A-Z,a-z,0-9,_,-}, beginning with an alphabetic
character. All characters in a name are significant, and case is
significant. Whitespace characters are space, tab and newline. Any string
starting with two dashes ("--") and ending with a newline is a comment.
A number is any sequence of digits. Any other tokens recognized
by the parser are enclosed in quotes in the syntax expressions below.


3.2.2  Expressions 

Expressions are constructed from variables, constants, and a collection
of operators, including Boolean connectives, integer arithmetic
operators, and case expressions. The syntax of expressions is as follows.


expr ::
    atom                   ;; a symbolic constant
  | number                 ;; a numeric constant
  | id                     ;; a variable identifier
  | "!" expr               ;; logical not
  | expr1 "&" expr2        ;; logical and
  | expr1 "|" expr2        ;; logical or
  | expr1 "->" expr2       ;; logical implication
  | expr1 "<->" expr2      ;; logical equivalence
  | expr1 "=" expr2        ;; equality
  | expr1 "<" expr2        ;; less than
  | expr1 ">" expr2        ;; greater than
  | expr1 "<=" expr2       ;; less than or equal
  | expr1 ">=" expr2       ;; greater than or equal
  | expr1 "+" expr2        ;; integer addition
  | expr1 "-" expr2        ;; integer subtraction
  | expr1 "*" expr2        ;; integer multiplication
  | expr1 "/" expr2        ;; integer division
  | expr1 "mod" expr2      ;; integer remainder
  | "next" "(" id ")"      ;; next value
  | set_expr               ;; a set expression
  | case_expr              ;; a case expression


An id, or identifier, is a symbol or expression which identifies an
object, such as a variable or defined symbol. Since an id can be an
atom, there is a possible ambiguity if a variable or defined symbol has
the same name as a symbolic constant. Such an ambiguity is flagged
by the compiler as an error. The expression next(x) refers to the value
of identifier x in the next state (see section 3.2.3). The order of parsing
precedence from high to low is


*,/

mod

|

->,<->


Operators  of  equal  precedence  associate  to  the  left.  Parentheses 
may  be  used  to  group  expressions. 

A case expression has the syntax

case_expr ::
    "case"
      expr_a1 ":" expr_b1 ";"
      expr_a2 ":" expr_b2 ";"
      ...
    "esac"

A case expression returns the value of the first expression on the
right hand side, such that the corresponding condition on the left hand
side is true. Thus, if expr_a1 is true, then the result is expr_b1.
Otherwise, if expr_a2 is true, then the result is expr_b2, etc. If none of
the expressions on the left hand side is true, the result of the case
expression is the numeric value 1. It is an error for any expression on
the left hand side to return a value other than the truth values 0 or 1.
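The evaluation rule for case expressions can be sketched in a few lines of Python. This is an illustration of the rule as described, not the SMV implementation: the function eval_case, the representation of branches as pairs of callables, and the dictionary used as a state are all assumptions made for the example.

```python
def eval_case(branches, state):
    """Return the value of the first branch whose condition is 1.

    branches is a list of (condition, result) pairs; each element is a
    function from the state to a value.
    """
    for cond, result in branches:
        c = cond(state)
        if c not in (0, 1):
            raise ValueError("condition must evaluate to 0 or 1")
        if c == 1:
            return result(state)
    return 1          # no condition held: the case expression yields 1

# The next(semaphore) rule of the semaphore example, written as branches.
next_semaphore = [
    (lambda s: int(s["state"] == "entering"), lambda s: 1),
    (lambda s: int(s["state"] == "exiting"), lambda s: 0),
    (lambda s: 1, lambda s: s["semaphore"]),
]

value = eval_case(next_semaphore, {"state": "exiting", "semaphore": 1})
```

The final branch with condition 1 plays the role of a default, which is why the examples in this chapter end their case expressions that way.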

A set expression has the syntax

set_expr ::
    "{" val1 "," val2 "," ... "}"
  | expr1 "in" expr2       ;; set inclusion predicate
  | expr1 "union" expr2    ;; set union

A set can be defined by enumerating its elements inside curly braces.
The elements of the set can be numbers or symbolic constants. The
inclusion operator tests a value for membership in a set. The union
operator takes the union of two sets. If either argument is a number or
symbolic value instead of a set, it is coerced to a singleton set.


3.2.3  Declarations 

The  VAR  declaration 

A  state  of  the  model  is  an  assignment  of  values  to  a  set  of  state  variables. 
These  variables  (and  also  instances  of  modules)  are  declared  by  the 
notation 

decl :: "VAR"
          atom1 ":" type1 ";"
          atom2 ":" type2 ";"
          ...


The type associated with a variable declaration can be either Boolean,
scalar, or a user defined module. A type specifier has the syntax

type :: boolean
      | "{" val1 "," val2 "," ... "}"
      | atom [ "(" expr1 "," expr2 "," ... ")" ]
      | "process" atom [ "(" expr1 "," expr2 "," ... ")" ]

val :: atom | number

A variable of type boolean can take on the numerical values 0 and
1 (representing false and true, respectively). In the case of a list of
values enclosed in set brackets (where atoms are taken to be symbolic
constants), the variable is a scalar which can take any of these
values. Finally, an atom optionally followed by a list of expressions in
parentheses indicates an instance of module atom (cf. section 3.2.4).
The keyword process causes the module to be instantiated as an
asynchronous process (cf. section 3.2.6).

The  ASSIGN  declaration 

An  assignment  declaration  has  the  form 

decl :: "ASSIGN"
          dest1 ":=" expr1 ";"
          dest2 ":=" expr2 ";"
          ...

dest :: atom
      | "init" "(" atom ")"
      | "next" "(" atom ")"

On the left hand side of the assignment, atom denotes the current
value of a variable, init(atom) denotes its initial value, and
next(atom) denotes its value in the next state. If the expression on
the right hand side evaluates to an integer or symbolic constant, the
assignment simply means that the left hand side is equal to the right
hand side. On the other hand, if the expression evaluates to a set, then
the assignment means that the left hand side is contained in that set.
It is an error if the value of the expression is not contained in the range
of the variable on the left hand side.

In order for a program to be implementable, there must be some
order in which the assignments can be executed such that no variable
is assigned after its value is referenced. This is not the case if there
is a circular dependency among the assignments in any given process.
Hence, such a condition is an error. In addition, it is an error for a
variable to be assigned more than once simultaneously. To be precise,
it is an error if:

1. the next or current value of a variable is assigned more than once
in a given process, or

2. the initial value of a variable is assigned more than once in the
program, or

3. the current value and the initial value of a variable are both
assigned in the program, or

4. the current value and the next value of a variable are both
assigned in the program, or

5. there is a circular dependency, or

6. the current value of a variable depends on the next value of a
variable.
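The requirement that some execution order respects all data dependencies amounts to asking for a topological order of the dependency graph. The Python sketch below illustrates that idea using the standard library's graphlib module (Python 3.9 or later). It is not the check performed by the SMV compiler: the function assignment_order and the dictionary encoding of dependencies are assumptions made for this example, and only current-value dependencies are modelled.

```python
from graphlib import TopologicalSorter, CycleError

def assignment_order(deps):
    """Return an order in which current-value assignments can be executed.

    deps maps each assigned symbol to the set of symbols that its right
    hand side reads in the current state. A circular dependency is
    reported as a ValueError.
    """
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as err:
        raise ValueError(f"circular dependency: {err.args[1]}") from None

# carry_out := value & carry_in;  reads value and carry_in
order = assignment_order({"carry_out": {"value", "carry_in"}})

# a := a + 1;  the current value of a depends on itself
try:
    assignment_order({"a": {"a"}})
    absurd_rejected = False
except ValueError:
    absurd_rejected = True
```

An assignment such as next(a) := a + 1 does not appear in this graph at all, since the next value depending on the current value creates no dependency among current values.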

The  TRANS  declaration 

The transition relation R of the model is a set of current state/next
state pairs. Whether or not a given pair is in this set is determined by
a Boolean valued expression, introduced by the TRANS keyword. The
syntax of a TRANS declaration is

decl :: "TRANS" expr

It is an error for the expression to yield any value other than 0 or 1.
If there is more than one TRANS declaration, the transition relation is
the conjunction of all of the TRANS declarations.

The  INIT  declaration 

The set of initial states of the model is determined by a Boolean
expression under the INIT keyword. The syntax of an INIT declaration
is

decl :: "INIT" expr

It is an error for the expression to contain the next() operator,
or to yield any value other than 0 or 1. If there is more than one
INIT declaration, the initial set is the conjunction of all of the INIT
declarations.

The SPEC declaration

The  system  specification  is  given  as  a  formula  in  the  temporal  logic 
CTL,  introduced  by  the  keyword  SPEC.  The  syntax  of  this  declaration 
is 


decl :: "SPEC" ctlform


A CTL formula has the syntax


ctlform ::
    expr                       ;; a Boolean expression
  | "!" ctlform                ;; logical not
  | ctlform1 "&" ctlform2      ;; logical and
  | ctlform1 "|" ctlform2      ;; logical or
  | ctlform1 "->" ctlform2     ;; logical implies
  | ctlform1 "<->" ctlform2    ;; logical equivalence
  | "E" pathform               ;; existential path quantifier
  | "A" pathform               ;; universal path quantifier


The  syntax  of  a  path  formula  is 


pathform ::
    "X" ctlform                ;; next time
  | "F" ctlform                ;; eventually
  | "G" ctlform                ;; globally
  | ctlform1 "U" ctlform2      ;; until


The order of precedence of operators is (from high to low)


E,A,X,F,G,U 


->,<-> 


Operators of equal precedence associate to the left. Parentheses
may be used to group expressions. It is an error for an expression in a
CTL formula to contain a next() operator or to return a value other
than 0 or 1. If there is more than one SPEC declaration, the specification
is the conjunction of all of the SPEC declarations.

The FAIR declaration

A fairness constraint is a CTL formula which is assumed to be true
infinitely often in all fair execution paths. When evaluating
specifications, the model checker considers path quantifiers to apply
only to fair paths. Fairness constraints are declared using the following
syntax:

decl :: "FAIR" ctlform

A path is considered fair if and only if all fairness constraints
declared in this manner are true infinitely often.

The  DEFINE  declaration 

In order to make descriptions more concise, a symbol can be associated
with a commonly used expression. The syntax for this declaration is

decl :: "DEFINE"
          atom1 ":=" expr1 ";"
          atom2 ":=" expr2 ";"
          ...

Whenever an identifier referring to the symbol on the left hand
side occurs in an expression, it is replaced by the value of the expression
on the right hand side (not the expression itself). Forward references
to defined symbols are allowed, but circular definitions are not allowed,
and result in an error.

3.2.4  Modules 

A module is an encapsulated collection of declarations. Once defined, a
module can be reused as many times as necessary. Modules can also be
parameterized, so that each instance of a module can refer to different
data values. A module can contain instances of other modules, allowing
a structural hierarchy to be built. The syntax of a module is as follows.

module ::
    [ "OPAQUE" ]
    "MODULE" atom [ "(" atom1 "," atom2 "," ... ")" ]
    decl1
    decl2
    ...

The optional keyword OPAQUE is explained in the section on
identifiers. The atom immediately following the keyword MODULE is the name
associated with the module. Module names are drawn from a separate
name space from other names in the program, and hence may coincide
with names of variables and definitions without conflict. The optional
list of atoms in parentheses are the formal parameters of the module.
Whenever these parameters occur in expressions within the module, they
are replaced by the actual parameters which are supplied when the
module is instantiated.

An instance of a module is created using the VAR declaration (cf.
section 3.2.3). This declaration supplies a name for the instance, and
also a list of actual parameters, which are assigned to the formal
parameters in the module definition. An actual parameter can be any
legal expression. It is an error if the number of actual parameters is
different from the number of formal parameters. The semantics of
module instantiation is similar to call-by-reference. For example, consider
the following program fragment:


VAR 

a  :  boolean; 
b  :  foo(a); 

MODULE  foo(x) 

ASSIGN 

  x := 1;

The  variable  a  is  assigned  the  value  1.  Now  consider  the  following 
program: 


DEFINE
  a := 0;
VAR
  b : bar(a);

MODULE bar(x)
DEFINE
  a := 1;
  y := x;

In this program, the value assigned to y is 0. Using a call-by-name
(macro expansion) mechanism, the value of y would be 1, since a would
be substituted as an expression for x.

Forward references to module names are allowed, but circular
references are not, and result in an error.

3.2.5  Identifiers 

An id, or identifier, is an expression which references an object. Objects
are instances of modules, variables, and defined symbols. The syntax
of an identifier is as follows.

id ::
    atom
  | id "." atom

An atom identifies the object of that name as defined in a VAR or
DEFINE declaration. If a identifies an instance of a module, then the
expression a.b identifies the component object named b of instance a.
This is precisely analogous to accessing a component of a structured
data type. Note that an actual parameter of module instance a can
identify another module instance b, allowing a to access components of
b, as in the following example:


VAR
  a : foo(b);
  b : bar(a);

MODULE foo(x)
DEFINE
  c := x.p | x.q;

MODULE bar(x)
VAR
  p : boolean;
  q : boolean;


Here, the value of c is the logical or of p and q. If the keyword
OPAQUE appears before a module definition, then the variables of an
instance of that module are not externally accessible. Thus, the following
program fragment is not legal:

VAR 

a  :  foo(); 

DEFINE 
b  :=  a.x; 

OPAQUE MODULE foo()
VAR
  x : boolean;


3.2.6  Processes 

Processes are used to model interleaving concurrency, with shared
variables. A process is a module which is instantiated using the keyword
process (cf. section 3.2.3). The program executes a step by
non-deterministically choosing a process, then executing all of the
assignment statements in that process in parallel, simultaneously. Each
instance of a process has a special Boolean variable associated with it
called running. The value of this variable is 1 if and only if the process
instance is currently selected for execution. The rule for determining
whether a given variable is allowed to change value when a given
process is executing is as follows: if the next value of a given variable is
not assigned in the currently executing process, but is assigned in some
other process, then the next value is the same as the current value.

3.2.7  Programs 

The syntax of an SMV program is

program ::
    module1
    module2
    ...

There must be one module with the name main and no formal
parameters. The module main is the one instantiated by the compiler.


3.3  Formal  semantics

In this section we assign a formal semantics to SMV programs. In
essence, a program is viewed as a system of equations whose solutions
determine the transition relation and initial states of a Kripke structure.
In fact, this semantics assigns meaning to some programs which are not
actually accepted by the compiler due to the rules regarding multiple
assignments and circular dependencies. Here, we define a semantics for
a subset of the language which does not include the process keyword.
This subset will be called SMV_0. The semantics of SMV_0 is syntax
directed: the denotation of a program is a function of the denotations
of its syntactic components. It is also compositional with regard to
bisimulation and simulation, as we will prove in chapter 5. This makes
it possible to use compositional proof methods for verifying SMV_0
programs, including induction over the structure of programs. The
semantics for the full language, which includes the process keyword, is
given in appendix A.

3.3.1  The  model 

The set $N$ of names is the set of all character strings made up of
the letters, the digits, the underscore and the minus sign characters,
beginning with a letter. The store $L = L_V \cup L_H$ is made up of two
disjoint, countably infinite sets of locations $L_V$ and $L_H$. We will call
the former the visible locations, and the latter the hidden locations.
The set of locations $L$ is defined recursively. It is the least set such
that

1. if $n \in N$, then $n \in L_V$, and

2. if $l \in L_V$ and $n \in N$, then $l.n \in L_V$, and

3.  if  /  G  Lv ■  then  .1  G  Lh- 

The set of values $V$ is the union of the integers in the range
$[-2^{31}, 2^{31} - 1]$ and $N$, the set of names. A state $x : L \to V$ is a
function from locations to values. Let $S = L \to V$ be the set of all
possible states.

If $p$ is a declaration, then its denotation $\llbracket p \rrbracket$ is a
triple $(T, I, R)$. The $T$ component is a partial function from $L$ to the
finite subsets of $V$. If $l$ is a location, then $T(l)$, when defined, is
the type of $l$, that is, the set of values that can be assigned to
location $l$. The component $I \subseteq S$ is the set of initial states.
Finally, the component $R \subseteq S \times S$ is the transition relation.

In  the  following  sections,  we  define  the  denotations  of  the  various 
kinds  of  declarations.  We  then  define  a  composition  operator  ||  which 
gives  the  denotation  of  a  program  in  terms  of  its  declarations. 

3.3.2  Expressions 

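An expression denotes a function from states to finite subsets of $V$,
according to the following rules:

1. If $v$ is a value, then $\llbracket v \rrbracket(x) = \{v\}$.

2. If $l$ is a location, then $\llbracket l \rrbracket(x) = \{x(l)\}$.

3. If $e_1, e_2$ are expressions, and $\circ$ is one of

   +, -, *, /, mod, >, >=, <, <=, =, &, |, ->, <->

   then

   $\llbracket e_1 \circ e_2 \rrbracket(x) = \{ \llbracket \circ \rrbracket(v_1, v_2) \mid v_1 \in \llbracket e_1 \rrbracket(x),\ v_2 \in \llbracket e_2 \rrbracket(x) \}$

4. If $e$ is an expression, then

   $\llbracket !e \rrbracket(x) = \{ \llbracket ! \rrbracket(v) \mid v \in \llbracket e \rrbracket(x) \}$

5. If $e_1, e_2$ are expressions,

   $\llbracket e_1 \text{ union } e_2 \rrbracket(x) = \llbracket e_1 \rrbracket(x) \cup \llbracket e_2 \rrbracket(x)$

6. If $e_1, e_2$ are expressions,

   $\llbracket e_1 \text{ in } e_2 \rrbracket(x) = \llbracket e_1 \rrbracket(x) \subseteq \llbracket e_2 \rrbracket(x)$
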
The functions denoted by +, *, / are the usual functions of
arithmetic modulo $2^{32}$. The function denoted by mod is the positive
remainder of division mod $2^{32}$. The functions denoted by the relational
operators >, >=, < and <= return 0 when the relation is false and 1 when
the relation is true, and are defined for numeric values only. For
non-numeric values, they return $\bot$. The equality operator = is defined
for all values, and returns 0 when they are unequal, and 1 when they are
equal. The functions denoted by the Boolean operators & (for and),
| (for or), ! (for not), -> (for implies) and <-> (for logical equivalence)
are defined only for the values 0 and 1, and return $\bot$ otherwise.

3.3.3  Assignments  and  definitions 

There is no semantic difference between assignments and definitions.
If $l$ is a location, and $e$ is an expression, then the assignment l := e;
denotes a triple $(T, I, R)$, where

1. $T = \emptyset$

2. $I = S$

3. $R = \{(x, y) \in S^2 \mid x(l) \in \llbracket e \rrbracket(x)\}$

The assignment next(l) := e; denotes a triple $(T, I, R)$ where

1. $T = \emptyset$

2. $I = S$

3. $R = \{(x, y) \in S^2 \mid y(l) \in \llbracket e \rrbracket(x)\}$

The assignment init(l) := e; denotes a triple $(T, I, R)$ where

1. $T = \emptyset$

2. $I = \{x \in S \mid x(l) \in \llbracket e \rrbracket(x)\}$

3. $R = S^2$

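
As an illustration of the three kinds of assignment, the following sketch enumerates their denotations over a toy state space. Everything specific here is an assumption made for the example: there is a single location `l` with values 0 to 2, states are identified with the value of `l`, and the expression `e` is a made-up nondeterministic one.

```python
from itertools import product

# Assumed toy setting: one location l with values 0..2, so a state is
# simply the value x(l). S is the set of all states.
S = range(3)

def e(x):
    """Set-valued denotation of a hypothetical expression in state x."""
    return {(x + 1) % 3, 0}

# next(l) := e;  constrains the successor: y(l) must lie in [[e]](x).
R_next = {(x, y) for x, y in product(S, S) if y in e(x)}

# init(l) := e;  constrains only the initial states.
I_init = {x for x in S if x in e(x)}

# l := e;  constrains the current state of every transition.
R_now = {(x, y) for x, y in product(S, S) if x in e(x)}

print(sorted(R_next))   # [(0, 0), (0, 1), (1, 0), (1, 2), (2, 0)]
print(sorted(I_init))   # [0]
print(sorted(R_now))    # [(0, 0), (0, 1), (0, 2)]
```

Each assignment restricts exactly one component of the triple and leaves the others unconstrained, which is what lets the denotations of separate declarations be combined by intersection.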

3.3.4 Variable declarations

If $l$ is an identifier and $v_1, \ldots, v_n$ are values, then

VAR l : {v_1, v_2, ..., v_n};

denotes a triple $(T, I, R)$ where

1. $T = \{(l, \{v_1, v_2, \ldots, v_n\})\}$

2. $I = \{x \in S \mid x(l) \in \{v_1, v_2, \ldots, v_n\}\}$

3. $R = \{(x, y) \in S^2 \mid x(l), y(l) \in \{v_1, v_2, \ldots, v_n\}\}$

3.3.5  Renaming 

Let $\phi : L \to L$ be a function from locations to locations. This in turn
induces a map $\Phi$ on states, such that for all states $x$ and locations $l$,

$\Phi(x)(l) = x(\phi(l))$.

If $M = (T, I, R)$, then let $\phi(M) = (T', I', R')$ where

1. $T'(\phi(l)) = T(l)$,

2. $I' = \{x \mid \Phi(x) \in I\}$ and

3. $R' = \{(x, y) \mid (\Phi(x), \Phi(y)) \in R\}$.

Note that the definition of $T'$ does not make sense if $\phi$ maps two
locations with different types onto the same location. In this case,
$\phi(M)$ is a type error. There are two rules regarding the renaming
function $\phi$ which must be respected to allow compositional reasoning
about SMV programs. These are:

1. A hidden location cannot be renamed to a visible location, and

2. Two distinct locations cannot be mapped to the same hidden
location.


These rules are respected by the SMV_0 semantics. Notice that it is
allowable to rename visible locations to hidden locations. In this way,
we can accomplish both hiding and renaming with the same operator.
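
The effect of a renaming on the set of initial states can be sketched as follows. The sketch assumes states restricted to a finite list of locations and stored as dictionaries; the function name `rename_initial` and the renaming of `a` to `u.a` are hypothetical.

```python
def rename_initial(phi, locations, initial, new_states):
    """Return I' = {x | Phi(x) in I} for states given as dicts.

    phi        -- renaming from old locations to new locations
    locations  -- the (finite) old locations considered in this sketch
    initial    -- list of old initial states
    new_states -- candidate states over the new locations
    """
    def induced(x):
        # Phi(x)(l) = x(phi(l)): read the old location through its new name.
        return {l: x[phi(l)] for l in locations}
    return [x for x in new_states if induced(x) in initial]

phi = lambda l: "u." + l                       # hypothetical renaming a -> u.a
I = [{"a": 0}]                                 # original initial states
new_states = [{"u.a": v} for v in (0, 1)]
print(rename_initial(phi, ["a"], I, new_states))   # [{'u.a': 0}]
```

The transition relation is renamed in the same way, by applying the induced map to both components of each pair.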

3.3.6 Parallel composition

The parallel composition of two processes $M_1 \parallel M_2$ is formed in two
steps. First, a renaming is applied to map the hidden locations of $M_1$
and $M_2$ onto disjoint spaces. Then the union of the type functions $T$
and the intersections of the initial sets $I$ and the transition relations
$R$ are taken. Clearly, this does not make sense if the $T$ components
do not agree on the type of some location, since the union would not
be a function. Formally, let $M_1 = (T_1, I_1, R_1)$ and $M_2 = (T_2, I_2, R_2)$.
Let $n_1$ and $n_2$ be two distinct names. For $i \in \{1, 2\}$, let
$\theta_i(l) = .n_i.l$ for all $l \in L_H$ and $\theta_i(l) = l$ otherwise, and let
$M_i' = \theta_i(M_i)$. The parallel composition $M = M_1 \parallel M_2$ is
defined as follows:

1. $T = T_1' \cup T_2'$

2. $I = I_1' \cap I_2'$

3. $R = R_1' \cap R_2'$

If $d_1, d_2, \ldots, d_k$ are declarations, then $\llbracket d_1\ d_2\ \ldots\ d_k \rrbracket$ is the
parallel composition

$\llbracket d_1 \rrbracket \parallel \llbracket d_2 \rrbracket \parallel \cdots \parallel \llbracket d_k \rrbracket$
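
A minimal sketch of the second step of the composition, assuming the hidden locations have already been renamed apart (the example has none). The function `compose`, the pair encoding of states, and the two example declarations are assumptions made for illustration.

```python
from itertools import product

def compose(m1, m2):
    """Parallel composition of two (T, I, R) triples over the same state space."""
    t1, i1, r1 = m1
    t2, i2, r2 = m2
    for loc in t1.keys() & t2.keys():
        if t1[loc] != t2[loc]:
            # The union of the type functions would not be a function.
            raise TypeError(f"type clash on {loc}")
    return ({**t1, **t2}, i1 & i2, r1 & r2)

S = set(product((0, 1), repeat=2))            # states are pairs (a, b)
ALL = set(product(S, S))                      # the unconstrained relation

# next(a) := b;  every state is initial, successor's a equals current b.
m1 = ({"a": {0, 1}}, S, {(x, y) for x, y in ALL if y[0] == x[1]})
# init(a) := 0;  initial states have a = 0, transitions are unconstrained.
m2 = ({"a": {0, 1}}, {x for x in S if x[0] == 0}, ALL)

T, I, R = compose(m1, m2)
print(sorted(I), len(R))                      # [(0, 0), (0, 1)] 8
```

Since each declaration constrains only part of the triple, intersecting the components accumulates the constraints of all declarations in the program.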


3.3.7  Instantiation 

Suppose that module $A$ is defined as follows:

MODULE A(n_1, n_2, ..., n_k) D

where $n_1, n_2, \ldots, n_k$ are distinct names and $D$ is a sequence of
declarations. Let $r, l_1, l_2, \ldots, l_k$ be visible locations. Let $\Theta$ be a
renaming, such that, for all $l \in L_V$,

1. for all $1 \le i \le k$, $\Theta(n_i) = l_i$, and $\Theta(n_i.l) = l_i.l$,

2. for all $n \in N - \{n_1, n_2, \ldots, n_k\}$, $\Theta(n) = r.n$, and $\Theta(n.l) = r.n.l$,

3. $\Theta(.l) = .l$

Then $\llbracket \text{VAR } r : A(l_1, l_2, \ldots, l_k); \rrbracket = \Theta(\llbracket D \rrbracket)$.

On the other hand, suppose that $A$ is defined as follows:

OPAQUE MODULE A(n_1, n_2, ..., n_k) D

where $n_1, n_2, \ldots, n_k$ are distinct names and $D$ is a sequence of
declarations. Let $r, l_1, l_2, \ldots, l_k$ be visible locations. Let $m_1$ and $m_2$
be distinct names in $N$. Let $\Theta$ be a renaming, such that, for all $l \in L_V$,

1. for all $1 \le i \le k$, $\Theta(n_i) = l_i$, and $\Theta(n_i.l) = l_i.l$,

2. for all $n \in N - \{n_1, n_2, \ldots, n_k\}$, $\Theta(n) = .m_1.n$, and
   $\Theta(n.l) = .m_1.n.l$,

3. $\Theta(.l) = .m_2.l$

Then $\llbracket \text{VAR } r : A(l_1, l_2, \ldots, l_k); \rrbracket = \Theta(\llbracket D \rrbracket)$.

3.3.8  Specifications 

Each program is associated with a Kripke structure which determines
the truth value of CTL formulas in the specification. The atomic
propositions in this case are all the Boolean valued expressions. The
Kripke structure associated with a program whose denotation is the triple
$(T, I, R)$ is a Kripke model $K = (S, R, L')$ where

1. $S$ is the set of states defined above,

2. $R$ is the transition relation, and

3. if $e$ is an expression, then

   $L'(e) = \{x \in S \mid \llbracket e \rrbracket(x) = \{1\}\}$

The specification is a formula $f$ in CTL with fairness constraints. It is
satisfied exactly when $K, s_0 \models f$ for all $s_0 \in I$.

Chapter 4

A Distributed Cache
Protocol


In this chapter, we look at an application of the SMV symbolic model
checker to a cache consistency protocol developed at Encore Computer
Corporation for their Gigamax distributed multiprocessor [MS91]. This
protocol is of interest as a test case for automatic verification for two
reasons. First, it is not a theoretical exercise, but a real design, which
is driven by considerations of performance and economics, as well as
the usual constraints of industrial design, such as compatibility with
existing hardware and software. Second, this protocol is a good example
of a system where random simulation methods are ineffective in finding
design errors.

The Gigamax is a distributed, shared memory multiprocessor, in
which the processors are grouped into clusters. Each cluster has a local
bus, and uses bus snooping [AB86] to maintain cache consistency within
the cluster. In addition, each cluster has an interface called a UIC,
which links the cluster into a network. The UIC keeps the caches in the
cluster consistent with the rest of the network by acting as both a bus
snooper and a bus master on behalf of the remote clusters, using a table
which keeps track of the remote status of all cache blocks from the local
main memory. This allows it to intervene in bus transactions which
affect remotely owned blocks, and to send appropriate invalidation or
call back requests to the network. The network is organized into a
hierarchy, as depicted in figure 4.1. The global bus, at the top of the
hierarchy, has one UIC connected to each cluster. These UICs record
the status of all cache blocks which are present in the corresponding
cluster. This eliminates the need for directory pointers in main memory,
at the possible expense of a bottleneck in the global bus.

Figure 4.1: Gigamax memory architecture

Protocols such as this are difficult to debug using simulation, in part
because the order of events such as cache misses and message arrivals
in various parts of the system is unpredictable. Subtle errors sometimes
require a long sequence of such events to manifest themselves.
Since the number of such sequences is combinatoric, the probability of
such a sequence occurring in a random simulation rapidly vanishes as
the sequence length increases. Nevertheless, for the design process to
stabilize, it is necessary to provide timely information about errors to
the design team, since the greater the delay in discovering an error, the
greater is the disruption required to fix it. Ideally, a protocol should be
error free before a hardware (or software) implementation is considered.
Otherwise, the options for fixing the errors will be greatly limited by
cost considerations, and the likelihood of the design change introducing
other errors will be high.

For this reason, we will consider the verification of the Gigamax
protocol at a high level of abstraction, neglecting many admittedly
important details of the implementation, such as the widespread use
of pipelining, or the link level protocol that communicates messages
between clusters. The basic method for building an abstract model
of a protocol is to introduce nondeterminism in those cases where the
level of detail of the model is insufficient to uniquely determine the
outcome of an event, or where design decisions have been left open.
We will make a note of places in the model where nondeterminism has
been used in this way, and in what way the state of an implementation
might correspond to the state of the abstract model.


4.1  The  Protocol 

The purpose of a cache consistency system is to provide the illusion
to the programmer of a distributed computer that all processors in
the system have access to a shared global store. This illusion must
be provided despite the fact that the physical storage is distributed.
To reduce the latency of access to the distributed main storage, each
processor is provided with a local cache, a semi-associative store which
holds a collection of memory blocks recently used by the processor. The
time required to access this store is less than that required to access
main storage. An access to a memory block stored in the cache is called
a hit, while an access to a memory block not stored in the cache is
called a miss. A miss requires an access to main storage (which may be
remote), to retrieve the required memory block and enter it in the cache.
This may result in the replacement of another block in the cache, to make
room for the block being entered in the cache. If the replaced block has
been modified while in the cache, it must be returned to main storage.
This is called a copy back operation.

The first cache consistency protocols for multiprocessors were called
bus snooping protocols [AB86]. They required that the processors in
the system be connected by a bus, or other broadcast medium. In a
bus snooping system, each time a memory access occurs over the bus,
all of the caches are checked to determine whether they contain the
addressed block. If the block is present in a cache, a change in status
may be required. For example, if the block is present and modified, the
access must be stalled while the modified data are copied back to main
storage. In a more sophisticated protocol, the cache with the modified
data may supply the data directly to the requesting cache, without the
intermediary of main storage. In case of a memory access caused by
an attempt to modify the data, all caches in which the block is present
must invalidate, that is, remove the block from cache storage. This
ensures that all cached copies of the block remain consistent.

The Gigamax protocol uses bus snooping techniques to maintain
consistency of the caches within a single cluster. The main difference
between the Gigamax snooping protocol and those described in [AB86]
is that the Gigamax uses a split transaction bus. This means that a
processor accessing memory over the bus first places a request on the
bus, and then frees the bus for other transactions while awaiting a
response. The bus snooping technique is not practical for large scale
multiprocessors, because the broadcast medium quickly becomes
saturated. For this reason, the Gigamax uses a message passing protocol to
maintain consistency between clusters. The split transaction bus
protocol allows traffic to continue on the bus while messages are in transit
in the network.

The terminology used in the sequel is changed somewhat from the
Encore terminology, and the protocol is somewhat simplified to make
the presentation clearer. The basic protocol is preserved, however,
including a subtle error which was discovered by the SMV system. The
following is a description of the protocol, first in English, then in the
SMV input language. In the model, we consider only the status of a
single memory block. This is our first use of abstraction, and results in
nondeterminism in several places in the model.

4.1.1  Processors 

Each memory block stored in each cache has an associated state, which
can be either invalid, shared, or owned. Alternative names for these
states would be absent, present, and modified, respectively. The shared
state indicates that there may be other processors which have this block
stored in their cache. Therefore, a block in the shared state can be
read by the processor, but not written, since writing might result in an
inconsistency between two caches. The owned state indicates that no
other processors have this block in their cache, and that the data in the
cache have been modified. Therefore, a block in the owned state can
be both read and written by the processor. The invalid state indicates
that the block is not present in the cache. Therefore, the block cannot
be read or written by the processor.

MODULE cache-device
VAR
  state : {invalid, shared, owned};

DEFINE
  readable := ((state = shared) | (state = owned)) & !waiting;
  writable := (state = owned) & !waiting;

The split transaction bus snooping protocol works in the following
way. At each bus cycle, the bus arbiter chooses a processor among the
requesting processors to be the bus master. The remaining processors
are referred to as slaves. The master issues a command on the bus, of
which there are three basic types. A read command is a request for a
given memory block, and is answered by a response command. A write
command stores data in main memory. The write and response
commands can be combined into a single command called a write-response,
which has the simultaneous effect of supplying data to a requester and
storing it in main memory. Each command also signals the next state
that the bus master will enter. Thus, a read-owned command indicates
that the bus master intends to modify the data, and a read-shared
indicates that it does not. A write-shared indicates that the bus master
is writing data, but maintaining a shared copy, while a write-invalid
indicates that it is not keeping the block (e.g., it is replacing it with
another block). The basic commands and their uses are summarized
in table 4.1. We note that no external command is required to go from
the shared state to the invalid state. This occurs when a shared block
is removed to make room for another block in the cache. Since our
model does not contain the states of any other blocks, we allow this
replacement to occur nondeterministically, at any time. Thus we model
any possible cache replacement policy.

A  slave,  observing  a  command  on  the  bus,  may  decide  to  modify  its 
state.  For  example,  a  slave  observing  a  read-owned  command  changes 
its  state  to  invalid,  since  the  bus  master,  entering  the  owned  state,  will 
assume  it  has  the  only  cached  copy  of  the  block.  Correspondingly,  a 
slave  in  the  owned  state  observing  a  read-shared  command  will  change 
to  the  shared  state.  A  special  command  called  invalidate  is  used  to 
invalidate  all  caches  in  the  system.  A  slave  observing  this  command 
changes  to  the  invalid  state. 
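
The master and slave transitions described above can be sketched as a set-valued next-state function. This is an illustrative reading rather than the model itself: the function name `next_states` is hypothetical, the `abort` and `waiting` arguments stand for the signals of the same name in the SMV model, and a set is returned because the replacement of a shared block is nondeterministic.

```python
def next_states(state, master, cmd, waiting=False, abort=False):
    """Possible next states of one cache block after a bus cycle."""
    if abort:                                   # a nullified command changes nothing
        return {state}
    if master:
        # The command issued by the master names the state it will enter.
        target = {"read-shared": "shared", "read-owned": "owned",
                  "write-invalid": "invalid", "write-shared": "shared"}
        return {target.get(cmd, state)}
    if cmd == "read-owned":                     # another cache takes ownership
        return {"invalid"}
    if cmd == "invalidate" and not waiting:
        return {"invalid"}
    if cmd == "read-shared" and state == "owned":
        return {"shared"}
    if state == "shared" and not waiting:       # block may be replaced at any time
        return {"shared", "invalid"}
    return {state}

print(next_states("owned", master=False, cmd="read-shared"))    # {'shared'}
print(next_states("invalid", master=True, cmd="read-owned"))    # {'owned'}
```

Returning both `shared` and `invalid` for an idle shared block is what lets a single model cover every cache replacement policy.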

ASSIGN
  init(state) := invalid;
  next(state) :=
    case
      abort : state;
      master :
        case
          CMD = read-shared : shared;
          CMD = read-owned : owned;
          CMD = write-invalid : invalid;
          CMD = write-shared : shared;
          1 : state;
        esac;
      !master :
        case
          CMD = read-owned : invalid;
          CMD = invalidate & !waiting : invalid;
          CMD = read-shared & state = owned : shared;
          state = shared & !waiting : {shared, invalid};
          1 : state;
        esac;
    esac;

On receiving the command, each slave checks its own cache and
indicates the state of the block in its own cache by asserting the signals
reply-owned and reply-waiting on the bus. These are wired-or signals,
meaning that the signal is observed to be asserted on the bus if one or
more caches assert the signal. The reply-owned signal is asserted by a
slave when the block is in the owned state in the slave's cache.
Reply-waiting is asserted when the slave has previously requested the block,
and is waiting for a response. This signal will be discussed in more
detail shortly. The process of looking up the slave's state and signaling
on the bus is known as bus snooping. On observing a read command, a
slave in the owned state sets a flag called snoop. This causes the cache
to issue a write-response at a later bus cycle, supplying the data to the
requester, and simultaneously storing it in main memory. When this
happens, the snoop flag is reset.

from state          command             to state   cause
------------------  ------------------  ---------  -----------------
invalid             read-shared         shared     read miss
invalid or shared   read-owned          owned      write miss
owned               write-invalid       invalid    copy-back
owned               write-resp-invalid  invalid    snoop read-owned
owned               write-shared        shared     write-through
owned               write-resp-shared   invalid    snoop read-shared

Table 4.1: Summary of commands

An additional reply signal called reply-stall may be asserted by any
slave, including main storage, if the slave is not ready to respond to the
command because some resource is busy. If reply-stall is asserted, the
command is nullified.

DEFINE
  reply-owned := state = owned;

VAR
  snoop : boolean;

ASSIGN
  init(snoop) := 0;
  next(snoop) :=
    case
      abort : snoop;
      state = owned & CMD = read-shared : 1;
      state = owned & CMD = read-owned : 1;
      CMD = response : 0;
      CMD = write-resp-invalid : 0;
      CMD = write-resp-shared : 0;
      1 : snoop;
    esac;

After issuing a read command, the master releases the bus and waits
for a response. During this time, a flag called waiting is set. Normally,
if no slave asserts reply-owned, the response comes from main memory.
If any slave asserts reply-owned, however, main memory is inhibited,
allowing the slave to supply the data at a future cycle with a
write-response command.

MODULE bus-device
VAR
  master : boolean;
  cmd : {idle, read-shared, read-owned, cty-read, write-invalid,
         write-shared, write-resp-invalid, write-resp-shared,
         invalidate, response};
  waiting : boolean;
  reply-stall : boolean;

ASSIGN
  init(waiting) := 0;
  next(waiting) :=
    case
      abort : waiting;
      master & CMD = read-shared : 1;
      master & CMD = read-owned : 1;
      CMD = response : 0;
      CMD = write-resp-invalid : 0;
      CMD = write-resp-shared : 0;
      1 : waiting;
    esac;

A slave which is waiting for a given cache block responds to any
read command for that block by asserting reply-waiting. This nullifies
the read command and forces the master to retry at a later cycle.

DEFINE
  reply-waiting := waiting;
  abort := REPLY-STALL
           | ((CMD = read-shared | CMD = read-owned)
              & REPLY-WAITING);
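
The interplay of the waiting flag and the abort condition can be traced with a short sketch. The function names `is_aborted` and `next_waiting` are hypothetical, and the scenario (two processors reading the same block in consecutive cycles) is an assumed example rather than one taken from the text.

```python
READS = ("read-shared", "read-owned")
RESPONSES = ("response", "write-resp-invalid", "write-resp-shared")

def is_aborted(cmd, reply_stall, reply_waiting):
    """A command is nullified by a stall, or by reply-waiting if it is a read."""
    return reply_stall or (cmd in READS and reply_waiting)

def next_waiting(waiting, master, cmd, aborted):
    """Next value of one device's waiting flag."""
    if aborted:
        return waiting
    if master and cmd in READS:
        return True                  # a read was issued; wait for the answer
    if cmd in RESPONSES:
        return False                 # the answer has appeared on the bus
    return waiting

# Cycle 1: processor A issues a read and starts waiting.
a_waiting = next_waiting(False, True, "read-shared",
                         is_aborted("read-shared", False, False))
# Cycle 2: processor B reads the same block. A asserts reply-waiting,
# so B's read is nullified and B does not start waiting.
b_aborted = is_aborted("read-owned", False, reply_waiting=a_waiting)
b_waiting = next_waiting(False, True, "read-owned", b_aborted)
print(a_waiting, b_aborted, b_waiting)   # True True False
```

Because the aborted master keeps its old flag values, it simply retries the read at a later cycle, as the text describes.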

The commands which may be issued by a processor when it is bus
master are a function of the state. For example, if the snoop flag is
set, the processor may issue a write-response on the bus. From the
owned state, a processor may issue a write-invalid command in order
to replace the cache block with another. A processor in the shared state
may issue a read-owned in case of a write miss, and a processor in the
invalid state may issue either a read-shared or a read-owned command,
in case of a read miss and write miss respectively.

MODULE processor(CMD, REPLY-OWNED, REPLY-WAITING, REPLY-STALL, DATA)
ISA bus-device
ISA cache-device

ASSIGN
  cmd :=
    case
      master & snoop & state = invalid : write-resp-invalid;
      master & snoop & state = shared : write-resp-shared;
      master & state = owned & !waiting : write-invalid;
      master & state = shared & !waiting : read-owned;
      master & state = invalid : {read-shared, read-owned};
      1 : idle;
    esac;
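
The choice of command can also be read as a function from the processor's local state to the set of commands it may issue. The sketch below mirrors the case analysis in order; the function name `commands` is hypothetical, and a set is returned because a processor in the invalid state may choose either kind of read.

```python
def commands(master, snoop, state, waiting):
    """Set of commands a processor may place on the bus this cycle."""
    if master and snoop and state == "invalid":
        return {"write-resp-invalid"}       # answer a snooped read-owned
    if master and snoop and state == "shared":
        return {"write-resp-shared"}        # answer a snooped read-shared
    if master and state == "owned" and not waiting:
        return {"write-invalid"}            # copy back to replace the block
    if master and state == "shared" and not waiting:
        return {"read-owned"}               # write miss on a shared block
    if master and state == "invalid":
        return {"read-shared", "read-owned"}
    return {"idle"}

print(sorted(commands(True, False, "invalid", False)))   # ['read-owned', 'read-shared']
print(commands(False, False, "owned", False))             # {'idle'}
```

The earlier cases take priority, so a pending snoop is always answered before the processor starts a transaction of its own.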

4.1.2  The  local  UIC  interface 

The UIC is the interface from one cluster to another. UICs come in
pairs, connected by a communication link. A UIC is said to be local
for a given memory block if that block is found in main storage on the
same side of the link as the UIC. It is said to be remote if the memory
block is found in a main memory on the other side of the link. Thus,
for any memory block, one of the UICs in the pair is local, and the
other remote. The UIC determines whether it is local or remote by
address decoding. We consider the local case first. In this discussion,
local refers to any part of the system on the bus side of the UIC, and
remote refers to any part of the system on the link side of the UIC.

Viewed from the bus, the UIC behaves like a processor, with the
capability to issue and respond to commands. The UIC's cache records
the state (but not the data) of all blocks of the local main storage that
are present in remote caches. This allows the UIC to snoop the bus on
behalf of remote caches. The UIC performs this function in exactly the
same manner as the processors. The state of a block in the UIC changes
with commands issued in the same manner as the state of cache blocks
in processor caches.

The UIC receives command messages from the link, and stores
them in one of two queues. The low priority queue is for read
commands, and the high priority queue is for all other commands. The
depth of the queues is arbitrary, but for now, we consider queues of
only one entry. A command in one of the queues is issued on the bus
when the UIC becomes master. If both queues are non-empty, the
command in the high priority queue is issued first. Provided the
command is not aborted, the queue issuing the command is emptied. Since
the UIC becomes bus master at nondeterministic intervals, the delay
between the time a message arrives in the queue and is issued on the
bus is arbitrary. This nondeterminism covers two abstractions made in
the model. First, it allows for any amount of latency in the link level
protocol, which is not modeled. Second, it allows the time to issue
an arbitrary number of messages relating to other memory blocks that
may be queued ahead of the one message that is modeled.

MODULE receiver
VAR
  hiq : {none, response, write-shared, write-resp-shared,
         write-invalid, write-resp-invalid, invalidate};
  loq : {none, read-owned, read-shared, cty-read};

ASSIGN
  cmd :=
    case
      master & !(hiq = none) : hiq;
      master & !(loq = none) : loq;
      1 : idle;
    esac;

  init(hiq) := none;
  next(hiq) :=
    case
      !master | abort : hiq;
      1 : none;
    esac;

  init(loq) := none;
  next(loq) :=
    case
      !master | abort | !(hiq = none) : loq;
      1 : none;
    esac;
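
The behaviour of the two one-entry queues can be traced with a small step function. This is a sketch of the receiver side only: the name `receiver_step` is hypothetical, `None` stands for an empty queue, and the arrival of new messages from the link is not modeled.

```python
def receiver_step(hiq, loq, master, abort):
    """Return (cmd, next_hiq, next_loq) for one bus cycle."""
    if master and hiq is not None:
        cmd = hiq                       # high priority queue is served first
    elif master and loq is not None:
        cmd = loq
    else:
        cmd = "idle"
    # A queue is emptied only when its command was issued and not aborted.
    next_hiq = hiq if (not master or abort) else None
    next_loq = loq if (not master or abort or hiq is not None) else None
    return cmd, next_hiq, next_loq

# Both queues hold a message; the UIC is master for two cycles in a row.
cmd1, hiq, loq = receiver_step("invalidate", "read-shared", master=True, abort=False)
cmd2, hiq, loq = receiver_step(hiq, loq, master=True, abort=False)
print(cmd1, cmd2, hiq, loq)             # invalidate read-shared None None
```

An aborted command leaves both queues unchanged, so the same message is offered again the next time the UIC becomes master.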

The local UIC can send commands to the link in response to
commands observed on the local bus. Whenever a read command is sent
to the link, it is entered in the remote UIC's low priority queue. If any
other command is sent to the link, it is entered in the remote UIC's
high priority queue. If the remote queue is full, the local bus cycle is
stalled.

MODULE sender
DEFINE
  lopri := sending in {read-shared, read-owned, cty-read};
  hipri := sending in {invalidate, response, write-shared,
                       write-invalid, write-resp-shared, write-resp-invalid};

ASSIGN
  next(remote.hiq) :=
    case
      !abort & remote.hiq = none & hipri : sending;
      1 : remote.hiq;
    esac;

  next(remote.loq) :=
    case
      !abort & remote.loq = none & lopri : sending;
      1 : remote.loq;
    esac;

  reply-stall :=
    (hipri & !(remote.hiq = none) |
     lopri & !(remote.loq = none)) union 1;

The local UIC sends command messages to the link in two cases.
The first is to invalidate or call back cache blocks in remote caches.
This occurs when the UIC is a slave and a read-owned or read-shared
is received on the bus. If the UIC is in the owned state, the read-owned
or read-shared is forwarded to the link. This causes the remote
cache in the owned state to issue a write-resp-invalid or
write-resp-shared, returning the cache block to the local bus. If the UIC is in the
shared state, and a read-owned is received on the bus, an invalidate
command is forwarded to the link. This causes all remote caches to
go to the invalid state. Note that this may allow a processor on the
local bus to write before the invalidate command has reached all
remote caches. This is a possible violation of strict consistency, which is
tolerated for performance reasons. Hence, the protocol does not
implement a strongly consistent memory model. The memory model which
the protocol does support will be discussed in more detail in the next
section.

The second case in which the local UIC sends a command to the link
is when the UIC has issued a read-shared or read-owned and is waiting
for a response. In this case, if the UIC is a slave and a response,
write-resp-shared, or write-resp-invalid is asserted on the bus, a response
is sent to the link.

MODULE local-UIC(remote, CMD, REPLY-OWNED, REPLY-WAITING,
                 REPLY-STALL, DATA)
ISA bus-device
ISA cache-device
ISA receiver
ISA sender

DEFINE
  sending :=
    case
      master : none;
      CMD = read-shared & state = owned : read-shared;
      CMD = read-owned & state = owned : read-owned;
      CMD = read-owned & state = shared : invalidate;
      CMD = write-resp-invalid & waiting : write-resp-invalid;
      CMD = write-resp-shared & waiting : write-resp-shared;
      CMD = response & waiting : response;
      1 : none;
    esac;
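
The two cases in which the local UIC forwards a message can be sketched as a single function. The name `local_sending` and the set `HIGH_PRIORITY` are hypothetical labels for this example; the classification of messages follows the rule that reads go to the low priority queue and everything else to the high priority queue.

```python
HIGH_PRIORITY = {"invalidate", "response", "write-shared",
                 "write-invalid", "write-resp-shared", "write-resp-invalid"}

def local_sending(master, cmd, state, waiting):
    """Message the local UIC forwards to the link, or "none"."""
    if master:
        return "none"                   # the UIC only forwards what it observes
    if cmd in ("read-shared", "read-owned") and state == "owned":
        return cmd                      # call the block back from the remote owner
    if cmd == "read-owned" and state == "shared":
        return "invalidate"             # remote shared copies must be dropped
    if cmd in ("write-resp-invalid", "write-resp-shared", "response") and waiting:
        return cmd                      # pass the answer on to the remote requester
    return "none"

msg = local_sending(master=False, cmd="read-owned", state="shared", waiting=False)
print(msg, msg in HIGH_PRIORITY)        # invalidate True
```

Since invalidate is not a read command, it travels in the remote UIC's high priority queue and is issued ahead of any queued read.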


4.1.3  The  Remote  UIC  interface 

When the UIC is remote, it behaves as if it were a main storage device.
It accepts read-shared, read-owned, write-shared, and write-invalid
commands from the bus, and forwards them to the local UIC via the link.
When the response arrives in the high priority queue, it issues the
response on the bus. In addition, it can provide a special service to
caches on the local side. If the remote UIC issues a read-shared or
read-owned command, and there is no reply on the remote bus (i.e., no
slave asserts reply-owned), it is assumed that the block was copied back
to main storage while the read command was in transit. The remote UIC
therefore sends the read command back to the local side. This operation
is called a courtesy read. The courtesy read will cause the main store
on the local bus to respond to the original requester.

MODULE remote-UIC(remote,CMD,REPLY-OWNED,REPLY-WAITING,
                  REPLY-STALL,DATA)

ISA bus-device
ISA receiver
ISA sender

DEFINE

  sending :=
    case
      master :
        case
          CMD = read-shared & !REPLY-OWNED : cty-read;
          CMD = read-owned & !REPLY-OWNED : cty-read;
          1 : none;
        esac;
      !master :
        case
          CMD = read-shared & !REPLY-OWNED : read-shared;
          CMD = read-owned & !REPLY-OWNED : read-owned;
          CMD = write-resp-invalid & waiting : write-resp-invalid;
          CMD = write-resp-shared & waiting : write-resp-shared;
          CMD = write-resp-shared & !waiting : write-shared;
          CMD = write-shared : write-shared;
          CMD = write-invalid : write-invalid;
          1 : none;
        esac;
    esac;

  reply-owned := 0;

The text for the complete model in the SMV language includes such
details as an abstracted model of main storage and the cluster bus,
which ties the above modules together. These are omitted here. Each
cluster is modeled as an asynchronous process. Hence, the early
quantification method for disjunctive relations can be used to avoid
constructing the global transition relation (cf. section 2.1.2).


4.1.4  Protocol  example 

As an example of the protocol in operation, consider the sequence of
events depicted in figures 4.2 and 4.3. In the figures, clusters 1 and 2
are both remote (i.e., the memory block in question resides in some
other cluster). The sequence begins when a read miss occurs in a
processor in cluster 2, while a processor in cluster 1 is in the owned
state. At this point, the following sequence of events might occur:

1. The processor in cluster 2 issues a read-shared command on the bus,
and sets its waiting flag.

2. The UIC in cluster 2 sends the read-shared command up the link,
storing it in the low priority queue of the global bus UIC for
cluster 2.

3. The global bus UIC for cluster 2 issues the read-shared command on
the global bus, entering the shared state, and setting its waiting flag.

4. Since the global bus UIC for cluster 1 is in the owned state, it
asserts reply-owned, sends the read-shared command down the link to
cluster 1, enters the shared state, and sets its snoop flag.

5. The UIC in cluster 1 issues this read-shared command, entering the
shared state and setting its waiting flag.

6. The processor in cluster 1 in the owned state asserts reply-owned,
enters the shared state, and sets its snoop flag.

7. The processor in cluster 1 issues a write-resp-shared command,
containing the block data, and clears its snoop flag.

8. The UIC in cluster 1 sends the write-resp-shared command up the link,
storing it in the high priority queue of the global bus UIC for
cluster 1, and clears its waiting flag.

9. The global bus UIC for cluster 1 issues the write-resp-shared command
on the global bus, and clears its waiting flag.

10. (a) The global bus UIC connected to main memory sends a write-shared
command containing the block data and (b) the global bus UIC for
cluster 2 sends a response command, clearing its waiting flag.

11. The UIC in cluster 2 issues the response command.

12. The requesting processor in cluster 2 stores the data in its cache,
and clears its waiting flag.

4.2 Verifying the protocol

We now consider the problem of formal specification and verification of
the protocol. The properties we will be concerned with are:

1. freedom from deadlock,

2. sequential consistency, and

3. local safety conditions, related to diagnostics.

Using the symbolic model checking technique, we can verify these
properties automatically, despite the very large state space of the
model. In fact, the model checker discovered a fairly subtle bug in the
protocol - an execution sequence leading to a deadlocked state.

4.2.1  Freedom  from  deadlock 

We will say that the protocol is deadlocked if it reaches a state in
which some processor is permanently blocked from receiving access to the
given memory block. Thus, our definition of deadlock takes in situations
that might also be called livelock, in which the system continues to
loop infinitely, but without the possibility of making progress. We can
express this property in CTL with the following formula, which must hold
for all processors:

$$\mathbf{AG}(\mathbf{EF}\ \mathit{readable} \wedge \mathbf{EF}\ \mathit{writable}) \tag{4.1}$$

In other words, it is always possible that the memory block will become
readable by the given processor, and always possible that it will become
writable. We can check this property using SMV by adding the following
specification to the processor module:

SPEC 

AG(EF  readable  &  EF  writable) 
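
As a rough illustration of what this specification asks for, the
following Python sketch evaluates AG(EF readable & EF writable) on a
small explicit state graph. It is not how SMV works internally (SMV
operates on OBDDs rather than on enumerated states), and the names ef,
ag, and deadlock_free, as well as the toy graphs in the usage note, are
our own assumptions. EF is computed as a least fixed point by backward
reachability, and AG is obtained as the dual of EF.

```python
def ef(states, succ, target):
    """States that can reach `target` in zero or more steps (least fixed point)."""
    reach = set(target)
    changed = True
    while changed:
        changed = False
        for s in states:
            if s not in reach and any(t in reach for t in succ[s]):
                reach.add(s)
                changed = True
    return reach


def ag(states, succ, good):
    """AG good, computed as the complement of EF (not good)."""
    bad = set(states) - set(good)
    return set(states) - ef(states, succ, bad)


def deadlock_free(states, succ, init, readable, writable):
    """Check AG(EF readable & EF writable) in the initial state `init`."""
    both_possible = ef(states, succ, readable) & ef(states, succ, writable)
    return init in ag(states, succ, both_possible)
```

With a graph such as {0: [1, 2], 1: [0], 2: [0]}, where state 1 is
readable and state 2 is writable, the check succeeds from state 0.
Adding a state 3 that loops only to itself and is reachable from 0 makes
the check fail, which corresponds to the livelock case covered by our
definition of deadlock.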

The specification turns out to be false, and as a counterexample, the
model checker produces an execution trace leading to a deadlocked state.
This is an actual bug in the original protocol which was found by the
model checker, but not in behavioral simulations. The complexity of the
counterexample, and the unusual sequence of events that leads to the
deadlock, should give some indication of why this error would be
unlikely to occur in random simulations. The time required to produce
the counterexample was slightly under ten minutes running on a Sun 3/60.

The steps of the counterexample are depicted in figures 4.4 to 4.6.
Cluster 1 is the local cluster, and clusters 2 and above are remote
clusters. We pick up the counterexample at a point where a processor in
cluster 2 is in the owned state:

1. A read miss occurs in a processor in cluster 1. This processor issues
a read-shared command on the bus. It enters the shared state and sets
its waiting flag.

2. Since the UIC in cluster 1 is in the owned state, it asserts
reply-owned, enters the shared state, and sends a read-shared command up
the link, storing it in the low priority queue of the global bus UIC for
cluster 1.

3. A processor in cluster 3 also issues a read-shared command. As a
result, the global bus UIC for cluster 3 issues the read-shared command
on the global bus, entering the shared state, and setting its waiting
flag.

4. Since the global bus UIC for cluster 2 is in the owned state, it
asserts reply-owned, sends a read-shared command down the link to
cluster 2, enters the shared state, and sets its snoop flag.

5. The UIC in cluster 2 issues this read-shared command, entering the
shared state and setting its waiting flag.

6. The processor in cluster 2 in the owned state asserts reply-owned,
enters the shared state, and sets its snoop flag.

7. The processor in cluster 2 issues a write-resp-shared command,
containing the block data, and clears its snoop flag.

8. The UIC in cluster 2 sends the write-resp-shared command up the link,
storing it in the high priority queue of the global bus UIC for
cluster 2, and clears its waiting flag.

9. The global bus UIC for cluster 2 issues the write-resp-shared command
on the global bus, and clears its waiting flag.

10. (a) The global bus UIC connected to main memory (cluster 1) sends a
write-shared command containing the block data and (b) the global bus
UIC for cluster 3 sends a response command, clearing its waiting flag.

11. The UIC in cluster 1 issues the write-shared command.

12. The block data are stored in main memory.

13. A processor in cluster 3 again issues a read-shared command. As a
result, the global bus UIC for cluster 3 issues the read-shared command
on the global bus, entering the shared state, and setting its waiting
flag.

14. Since reply-owned is not asserted, the UIC for cluster 1 sends the
read-shared command down the link towards main memory.

Figure 4.4: Deadlock example [diagram of steps 1 to 6]

Figure 4.5: Deadlock example (cont.) [diagram of steps 7 to 12]

Figure 4.6: Deadlock example (cont.) [diagram of steps 13 and 14]

At this point, the system is deadlocked. The original read-shared
command sent in step 1 in cluster 1 is still in the low priority queue
at the global bus level, but is stalled by the waiting flag set in the
global UIC for cluster 3. Similarly, the read-shared command sent by
cluster 3 is in the low priority queue in the cluster 1 UIC, but is
stalled by the waiting flag of the original requester. This is an
example of the classic deadlock situation which occurs when two
processes attempt to obtain locks on two resources (in this case two
buses) in different orders. Nonetheless, the sequence of events that
leads to this situation was sufficiently complex that the designers did
not anticipate that the situation could occur, and simulations did not
produce it. In fact, the deadlock situation was found at a search depth
of thirteen transitions. At each step in this sequence, there were
several alternatives that might have averted the deadlock. Thus it is
possible, but unlikely, that this deadlock would be found by a random
simulation run, or a simulation run based on address traces.¹
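
The circular wait just described can be pictured as a cycle in a
wait-for graph. The Python sketch below is only an analogy to the
protocol state, not part of the SMV model: the function find_cycle and
the labels used for the two stalled requests are hypothetical names
chosen for illustration. Each key is a blocked request, and its list
holds the requests whose waiting flag is stalling it.

```python
def find_cycle(waits_for):
    """Return one cycle in a wait-for graph as a list of nodes, or None.

    `waits_for` maps each blocked party to the parties it is waiting on.
    """
    done = set()  # nodes already explored without finding a cycle

    def visit(node, path):
        if node in path:
            return path[path.index(node):]  # closed a loop
        if node in done:
            return None
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt, path + [node])
            if cycle is not None:
                return cycle
        done.add(node)
        return None

    for start in waits_for:
        cycle = visit(start, [])
        if cycle is not None:
            return cycle
    return None


# Hypothetical labels for the two stalled read-shared commands.
stalled = {
    "read-shared from cluster 1": ["read-shared from cluster 3"],
    "read-shared from cluster 3": ["read-shared from cluster 1"],
}
print(find_cycle(stalled))
```

A cycle is reported because each request waits on a flag that only the
other request's completion would clear. If either edge is removed, no
cycle is found.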

The fact that the model checker was able to print out automatically an
example of this deadlock highlights an important practical aspect of the
technique. Counterexamples are of perhaps even greater value than a
proof that the system is correct, since such a proof is based on the
assumption that the system is correctly modeled, and the specification
is correct and complete. A counterexample, however, provides an
important clue as to where a bug in the system lies, and how it might be
corrected.


4.2.2  Correcting  the  deadlock 

The problem causing the deadlock is that the remote owner of the memory
block can write the data back to main memory while a read command from
the local cluster is in transit to the remote cluster. The write command
crosses the read command in the mail, so to speak. A remote request for
the same block can then lock the global bus, leading to deadlock. The
Encore engineers corrected the deadlock problem in the following way.
The write command, when it reaches the local bus, is converted by the
UIC into a write-response command. This supplies data to the local
requester and frees the local bus. Unfortunately, it also leaves an
orphan read command in the system. If a read command from the remote
side is issued on the local bus, and a remote processor subsequently
reaches the owned state, the orphan read command will disrupt the
protocol. To prevent this, when the orphan read is issued, it is
converted to a special command called echo-response, which is sent back
to the local cluster. The UIC in the local cluster stalls any commands
on the local bus until the echo-response arrives, thus guaranteeing that
the orphan read command is destroyed.

¹ In fact, the number of possible transitions from a given state ranges
from 6 to 12. The probability of a random simulation run executing this
trace is therefore in the range $6^{-13} = 7.7 \times 10^{-11}$ to
$12^{-13} = 9.4 \times 10^{-15}$. The expected time for a random
simulation to exhibit this behavior would be somewhere between 2.1 years
and 29 millennia, assuming the simulation could be carried out at 10,000
steps per second.
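
The per-trace probabilities quoted in the footnote to the deadlock
discussion can be reproduced with a few lines of Python. This is a
sketch under a simplifying assumption that we add here: every one of the
13 transitions is chosen uniformly and independently among a fixed
number of alternatives. The names DEPTH and trace_probability are ours.

```python
DEPTH = 13  # search depth at which the deadlock was found


def trace_probability(branching: int, depth: int = DEPTH) -> float:
    """Probability of following one fixed trace of `depth` steps when each
    step is picked uniformly among `branching` alternatives (assumption)."""
    return float(branching) ** -depth


for b in (6, 12):
    print(f"{b} alternatives per step: {trace_probability(b):.2e}")
```

The two extremes differ by a factor of 2^13, which is why the range of
estimates is so wide.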

The corrected model satisfies the absence of deadlock specification.
The performance of the SMV model checker in verifying this is plotted in
figure 4.7, for a model with 2 clusters, as the number of caches in each
cluster is increased from 2 to 6 (thus, in the largest model, there are
12 caches and 4 UICs). Part (a) shows the run time as a function of the
number of caches per cluster. Part (b) shows the number of OBDD nodes
used overall, and for representing the transition relation. Part (c)
shows the number of reachable states of the model. Although the run time
points are well fit by a quadratic curve, the actual asymptotic
performance is most likely cubic, as in the case of the synchronous
arbiter (cf. section 2.4.1), owing to linear increases in the transition
relation size, the number of fixed point iterations, and the size of the
OBDDs representing fixed point approximations.

Since the number of bus wires running between successive caches is
fixed, we can apply theorem 7 to show that the transition relation OBDD
size must grow linearly in the number of caches. The fact that the fixed
point approximation OBDDs also grow linearly bears further examination,
however. This phenomenon can be understood by considering the nature of
the protocol. Imagine cutting a cluster bus in half, and consider how
much information must be communicated from one half of the bus to the
other to determine whether a given state of the system is in the
reachable set or not. In fact, this amount is fixed, independent of the
number of caches on the bus, since we need only know if there are any
caches in the shared state or the owned state on the other side of the
cut, and not in particular which caches these are or how many. As a
result, the number of OBDD nodes (representing the reached state set) at
the level corresponding to our cut is bounded.² This is characteristic
of bus snooping protocols, and other protocols which are "loosely
coupled", in the sense that one half of the system has bounded knowledge
of the state of the other half of the system.
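
The cut argument can be illustrated on a toy predicate. The Python
sketch below does not use the Gigamax reachable set. It assumes a
stand-in predicate of our own, "at most one cache is in the owned
state", over one Boolean variable per cache, and counts the distinct
residual functions that remain after the variables on one side of a cut
have been fixed. That count bounds the number of OBDD nodes needed at
the corresponding level. The names at_most_one_owner, cut_width, and
max_cut_width are hypothetical.

```python
from itertools import product


def at_most_one_owner(bits):
    """Stand-in predicate: at most one cache holds the block as owned."""
    return sum(bits) <= 1


def cut_width(predicate, n, k):
    """Number of distinct residual functions after fixing the first k of n bits."""
    residuals = set()
    for prefix in product((0, 1), repeat=k):
        # Truth table of the predicate over the remaining n - k bits.
        table = tuple(predicate(prefix + suffix)
                      for suffix in product((0, 1), repeat=n - k))
        residuals.add(table)
    return len(residuals)


def max_cut_width(predicate, n):
    """Largest cut width over all interior cuts of the variable order."""
    return max(cut_width(predicate, n, k) for k in range(1, n))


for n in (4, 6, 8):
    print(n, max_cut_width(at_most_one_owner, n))
```

Only three situations need to be distinguished across any cut (no owner
so far, one owner so far, or more than one), so the width stays at 3
while the number of variable assignments doubles with each added cache.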

As part (c) of the figure shows, the number of states of the system
increases exponentially with the number of caches per cluster. Despite
this, the performance of the symbolic model checking algorithm is
polynomial. Thus, for this particular model and specification, we have
solved the state explosion problem.


4.2.3  Sequential  consistency 

When writing a formal specification for the Gigamax cache consistency
protocol, we need to consider the model of a distributed memory which
the Gigamax provides to the programmer. As mentioned previously, for
performance reasons the protocol does not maintain strict consistency of
the caches. A cache block in the shared state may be out of date for a
short time while an invalidate message is traversing the network. This
is tolerated, since maintaining strict consistency would require an
acknowledgment of invalidation to be collected from all caches in the
shared state before a cache block could be modified.

There are a number of distributed memory models that may be supported by
such a system. A totally ordered model is one in which all processors
observe all values written to the memory in the same order. For example,
in a totally ordered model, if the processors write into a location the
sequence of values 1, 2, 3, ..., then all processors which read the
location will observe any new values to be greater than or equal to all
previous values. We will show that the Gigamax protocol has this
property, for a one block system. In a partially ordered model, values
written may in some cases be observed in a different order by different
processors. Some guarantee of ordering is usually made. For example, all
writes must be observed in the same order relative to special
synchronization operations, or all writes by the same processor must be
observed in the same order. The latter model is supported by the Gigamax
protocol for writes to different cache blocks. Since our model of the
protocol only describes the behavior of one cache block, however, the
model cannot be used to check this property.

² For other applications of this kind of argument, see [Bry91].

Figure 4.7: Performance for checking deadlock [plots of execution time
(secs), OBDD nodes used, and reachable states]

Returning to the problem of total ordering of writes to the same block,
it might seem at first that there is no "finite state" description of a
protocol that writes an unbounded sequence of values. We can check the
property, however, by using an abstraction. We do this by choosing a
value n, and storing in the model only one bit of information - whether
the data value is less than n or greater than or equal to n. We then
assume that the processors never write a value less than n after a value
greater than or equal to n has been written, and we show that a
processor never reads a value less than n after reading a value greater
than or equal to n. Since the value of n is arbitrary, it follows that
all processors read data values in non-decreasing order, satisfying the
total ordering requirement. We now consider how to model the system
using this abstraction. For each cache, we introduce a variable whose
value is 0 when the data value is less than n and 1 when the data value
is greater than or equal to n. This variable may change whenever the
block is writable, but may only change from 0 to 1, since we assume the
processors only increase the data value. The following SMV code models
the data held in the processor's cache:

MODULE data-device
VAR

  data : boolean;

ASSIGN

  next(data) :=
    case
      !master & waiting & CMD in {response,write-resp-invalid,
                                  write-resp-shared} : DATA;
      writable : data union 1;
      1 : data;
    esac;

DEFINE

  data-enable := master & CMD in {response,write-resp-invalid,
                                  write-resp-shared,write-invalid};

Additionally,  we  introduce  variables  to  represent  the  values  on  the 
buses  and  the  values  in  the  high  priority  message  queues.  The  low 
priority  queues  hold  only  requests,  which  have  no  data  value. 
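
The reasoning behind the one-bit abstraction can be checked on concrete
sequences. The following Python sketch is independent of the SMV model,
and its function names are our own. It tests the claim that a sequence
of observed values is non-decreasing exactly when, for every threshold
n, the abstract bit "value >= n" never falls from 1 back to 0.

```python
def abstract(values, n):
    """One-bit abstraction of each value: 1 iff value >= n."""
    return [1 if v >= n else 0 for v in values]


def bit_never_falls(bits):
    """True if the bit sequence never steps from 1 back to 0."""
    return all(not (a == 1 and b == 0) for a, b in zip(bits, bits[1:]))


def non_decreasing(values):
    """True if each observed value is at least the previous one."""
    return all(a <= b for a, b in zip(values, values[1:]))


def ordered_for_all_thresholds(values):
    """Check the abstract property for every relevant threshold n.

    Only thresholds equal to an observed value can separate two elements,
    so it suffices to try those.
    """
    return all(bit_never_falls(abstract(values, n)) for n in set(values))
```

For example, the sequence 1, 3, 2 fails for the threshold n = 3, since
the bit goes 0, 1, 0, while 1, 2, 2, 5 passes for every threshold.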

We would now like to prove that this abstract model of the data path of
the protocol satisfies the following specification in CTL, for all
processors:

$$\mathbf{AG}[(\mathit{readable} \wedge \mathit{data} \geq n) \Rightarrow \mathbf{AG}\,\neg(\mathit{readable} \wedge \mathit{data} < n)] \tag{4.2}$$

In other words, if ever a value greater than or equal to n is observed,
a value less than n is never observed in the future. We can check this
using SMV by adding the following specification to the processor module:


SPEC 

AG(readable & data -> AG(readable -> data))

Figure 4.8 shows the performance of the symbolic model checking
algorithm in verifying this formula, again for a model with 2 clusters.
Part (a) of the figure shows the execution time, while part (b) shows
the amount of storage used. Notice that although the execution times are
roughly ten times those obtained for the model without data, they are
still cubic in the number of processors per cluster.


4.2.4  Correctness  of  diagnostics 

In addition to the above specifications, it was also particularly useful
to check that the diagnostics built into the protocol never flagged an
error under normal operation of the protocol. Errors are flagged by the
diagnostic system in each processor subsystem whenever a command is
observed on the bus which is inconsistent with the processor's local
state. Determining which command/state combinations are normal, and
which are errors, is difficult, and a number of errors of this type were
found in the protocol using the model checking technique.


Figure 4.8: Performance for checking sequential consistency [plots of
execution time (secs) and OBDD nodes used]

4.3 Evaluation

In verifying the Gigamax model with respect to the formal
specifications, the symbolic model checker was able to perform an
exhaustive search of the model's state space without explicitly
constructing the global state graph. As a result, the state explosion
problem was avoided. In addition, the model checker exposed a number of
subtle errors in the design that were not found in simulation. These
errors were usually caused by events (e.g., cache misses and message
arrivals) occurring out of the normal sequence anticipated by the
designers. This type of error is difficult to find in random
simulations, since the probability of a given sequence of random events
occurring by pure chance is in inverse exponential proportion to the
length of the sequence. As we have seen, the sequences necessary to
produce protocol errors can be quite long. As the design evolved to
correct the errors found by model checking, the model was easily
adapted, and quickly provided an analysis of any new errors introduced
by design changes. This tends to amortize the initial effort required to
produce the protocol model. The ability of the symbolic model checker to
find errors quickly makes it easier to experiment with alternative
designs, and also helps to build the designer's intuition about the
behavior of the system. This is important, because designers tend to
concentrate on normal sequences of events, and overlook the unusual
sequences. The use of OBDDs in the symbolic model checker made it
possible to check a model that would have been very time consuming, or
perhaps impossible, to check using earlier algorithms.

At this point, the technique has a number of limitations. One limitation
is the use of OBDDs. For example, while we find the OBDD sizes growing
polynomially in the number of caches in the Gigamax model, if we instead
increase the number of cache blocks and leave the number of caches
constant, we find the size of the OBDDs increasing exponentially. As a
result, it was extremely difficult to check specifications of a system
with just two cache blocks (some runs took up to a week, and others
never finished). In these cases, the size of the OBDDs representing the
fixed point approximations became intractably large. When this happens,
techniques such as early quantification that make the representation of
the transition relation smaller are of little use, since they do not
affect the size of the OBDDs representing fixed point approximations.

Another major issue is implementation of the protocol. Clearly,
verification of the protocol itself is important, since a correct
protocol is a prerequisite for a correct implementation. This is, of
course, only half the story. Techniques are also needed to ensure that
the verified protocol is implemented correctly in hardware. This can in
fact be done, using a process of successive refinement of finite state
systems that has been studied extensively by Kurshan [Kur87]. The work
of Bose and Fisher [BF89a] is also an example of this. Unfortunately,
the truth of CTL formulas containing existential quantifiers is not
necessarily preserved by this kind of refinement. Thus, for example,
though the high level protocol may be deadlock free, a specific
implementation of the protocol may not be deadlock free.³ In order for
the implementation to preserve all CTL properties of the protocol, the
two would have to be bisimilar (cf. section 2.6.1). Since this is a very
strong requirement, it is not clear that the protocol could in fact be
implemented with this degree of accuracy in an efficient way. For
essentially this reason, Grumberg and Long have studied the use of a
subset of CTL using only universal path quantifiers for hierarchical
reasoning [GL91]. In any event, though checking the absence of deadlock
specification was very useful in finding bugs in the protocol, we must
attach a special caveat to this result, since it does not guarantee that
all reasonable implementations of the protocol will be deadlock free.

Finally, there is the problem of verifying a model with a finite number
of processors, when there is no finite limit on the number of processors
that could in principle be added to the system. In practice, the
intended maximum number of processors is approximately 100. Even using
the symbolic model checking technique, however, checking a system of 100
processors seems infeasible at present, and 1000 processors is out of
the question. To deal with systems with a very large number of identical
components, we can apply methods of induction over processes. As in the
case of successive refinement, induction methods are not fully
automatic - some human input is required in the form of an inductive
hypothesis. In the next chapter, we will deal with the problem of
induction over processes.

³ In fact, such a deadlock, involving an interaction between the memory
and processor subsystems, was known to the Encore engineers. The memory
system, when busy, would stall any new requests, but the stalled request
would still remain in the memory system's pipeline for four clock
cycles. Thus, when the processor retried the request four clock cycles
later, it would be stalled again, and the process would repeat
indefinitely.


Chapter  5 

Induction  and  model 
checking 

This chapter deals with the verification of systems that have an
arbitrary number of similar components, arranged in some inductively
defined structure. Systems of this type are commonplace - they occur in
bus protocols and network protocols, I/O channels, and many other
structures that are designed to be extensible by adding similar
components. After using a model checking system to determine the
correctness of a system configured with a fixed number of processors or
other components, it is natural to ask whether this number is enough in
some sense to represent a system with any number of components. For
example, a Gigamax system can be built by connecting some arbitrary
number of cluster buses to a global bus, then filling each cluster bus
with an arbitrary number of processor cards. It is practically
impossible to verify using model checking methods alone that all
possible configurations of the system satisfy the specifications, even
given a physical bound on the number of cards in a backplane. However,
by supplying an appropriate inductive hypothesis, we can in many cases
reduce the problem of verifying a system of arbitrary size to one of
verifying a system of fixed size. The inductive hypothesis can take the
form of a finite state process.



5.1  The  general  framework 

Induction over systems of processes can be put in a fairly general
framework, which is independent of the mechanics of the process model,
relying only on certain algebraic properties of the operators for combining
processes. Let us assume that we have a collection of processes, and a
collection of operators acting on processes. In a typical process model,
we have some form of parallel composition operator, some form of
operator for renaming signals, and perhaps a hiding operator, which makes
a given signal invisible to the outside. The exact choice of operators
is not material here, however. We require only that the operators be
monotonic with respect to a reflexive transitive relation ≤ on processes.
The idea of this order is that if p ≤ q, then p is in some sense more
specific, or more deterministic, than q. The properties we wish to verify
should be preserved as we descend the order.

As an example of induction on processes, suppose we have a parallel
composition operator ∥ on processes, which is monotonic with respect
to a pre-order ≤. In this case, we can apply the following induction
rule:

\[
\begin{array}{c}
p \le q \\
q \parallel p \le q \\
\hline
p \parallel \cdots \parallel p \le q
\end{array}
\]

Think of the inequalities p ≤ q and q ∥ p ≤ q as substitution rules.
If p ≤ q, we can safely substitute p for any occurrence of q in a given
term, in the sense that we will only make the term lesser in the partial
order. Thus, we can always substitute p for q on the lesser side of an
inequality. For example, if q ∥ p ≤ q, we have

\[
\begin{array}{l}
q \parallel p \le q \\
(q \parallel p) \parallel p \le q \\
((q \parallel p) \parallel p) \parallel p \le q
\end{array}
\]


If p ≤ q, we can substitute p for q, giving us p ∥ ··· ∥ p ≤ q. We
call q a process invariant. Other induction rules can be generated,
based on other substitutions. For example, assume we have a parallel
composition operator ∥ and a renaming operator φ, both monotonic
with respect to ≤. Then we have

\[
\begin{array}{c}
\phi(q) \parallel r \parallel p \le q \\
\hline
\phi(\phi(\cdots) \parallel r \parallel p) \parallel r \parallel p \le q
\end{array}
\]

[Figure 5.1: Processes generated by safe substitution. The figure shows
the chain q ≥ q–r ≥ q–r–r ≥ ···, with a copy of p attached to each r.]

Given  a  collection  of  substitution  rules,  we  can  inductively  generate  a 
class  of  processes  from  any  process  q.  For  example,  figure  5.1  depicts 
the  first  few  processes  in  the  class  generated  by  the  above  induction 
rule.  Every  process  in  the  class  is  smaller  than  q  in  the  partial  order. 
Thus,  any  properties  of  q  which  are  preserved  as  we  descend  the  partial 
order  are  inherited  by  all  the  processes  in  the  class.  The  key,  then,  is 
to  choose  a  partial  order  that  preserves  the  properties  we  are  interested 
in  verifying.  The  most  straightforward  way  to  do  this  is  to  choose  a 
class  of  properties  that  we  wish  to  preserve,  and  then  define  the  partial 
order  accordingly. 
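
The first induction rule of this section can be justified by a short induction on the number of copies of p, using only the two hypotheses, monotonicity of parallel composition, and transitivity of the pre-order. The derivation below is a sketch; the notation p^{(n)} for the n-fold composition of p (associated to the left) is introduced here for illustration only and is not part of the original text.

```latex
\begin{align*}
p^{(1)}   &= p \le q
          && \text{(first hypothesis)} \\
p^{(n+1)} &= p^{(n)} \parallel p \\
          &\le q \parallel p
          && \text{(inductive hypothesis $p^{(n)} \le q$ and monotonicity of $\parallel$)} \\
          &\le q
          && \text{(second hypothesis, then transitivity)}
\end{align*}
```

Hence every process in the generated class lies below q, which is exactly what allows properties preserved down the order to be inherited from q.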

For example, suppose we wish to preserve all properties expressible
in the logic CTL. In this case, the partial order we obtain is a degenerate
one, which partitions the Kripke models into a set of incomparable
equivalence classes. To see this, assume towards a contradiction that
p satisfies every CTL formula satisfied by q, and there is some formula
f satisfied by p but not by q. In this case, it follows that q satisfies
¬f. This implies, however, that p satisfies ¬f, a contradiction. Hence,
if p ≤ q, then p and q satisfy the same set of CTL formulas. Since
CTL characterizes Kripke models up to bisimulation [BCG87], it follows
that p and q are bisimilar. This is unfortunate, since we do not want
our induction framework to apply only to classes of Kripke structures
that are equivalent. In general, we would like to treat systems whose
behavior becomes more specific as we add processes to the system.

One way to do this is to use a subset of the logic. For example,
suppose we choose to preserve those formulas which use only universal
path quantifiers. This subset is called V-CTL [GL91]. A formula in CTL
is also in V-CTL if driving the negations in to the literals results in a
formula without the E path quantifier. Examples of V-CTL formulas
are


AG ¬EG p  =  AG AF ¬p
¬EG EX p  =  AF AX ¬p

Examples of CTL formulas which are not in V-CTL are

AG ¬AF p  =  AG EG ¬p
¬EG AX p  =  AF EX ¬p
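
The membership test described above (drive negations in to the literals, then look for an E path quantifier) can be sketched in a few lines. This is an illustrative sketch only: the tuple encoding of formulas and the names `nnf` and `is_universal` are assumptions made here, and the until operators are left out to keep the example short.

```python
# Formulas are nested tuples, e.g. ("AG", ("not", ("EG", ("ap", "p")))).
# Supported operators: ap, not, and, or, AX, EX, AF, EF, AG, EG (no until).

# Dual of each temporal operator under negation, e.g. not AF f = EG not f.
DUAL = {"AX": "EX", "EX": "AX", "AF": "EG", "EG": "AF", "AG": "EF", "EF": "AG"}


def nnf(f, negate=False):
    """Drive negations in to the atomic propositions."""
    op = f[0]
    if op == "ap":
        return ("not", f) if negate else f
    if op == "not":
        return nnf(f[1], not negate)
    if op in ("and", "or"):
        if negate:  # De Morgan
            op = "or" if op == "and" else "and"
        return (op, nnf(f[1], negate), nnf(f[2], negate))
    if op in DUAL:
        return (DUAL[op] if negate else op, nnf(f[1], negate))
    raise ValueError("unsupported operator: %r" % (op,))


def has_existential(g):
    """True if a formula already in negation normal form uses an E quantifier."""
    if g[0] == "ap":
        return False
    if g[0].startswith("E"):
        return True
    return any(has_existential(sub) for sub in g[1:])


def is_universal(f):
    """True if f uses only universal path quantifiers after normalization."""
    return not has_existential(nnf(f))


p = ("ap", "p")
in_1 = is_universal(("AG", ("not", ("EG", p))))   # AG not EG p
in_2 = is_universal(("not", ("EG", ("EX", p))))   # not EG EX p
out_1 = is_universal(("AG", ("not", ("AF", p))))  # AG not AF p
out_2 = is_universal(("not", ("EG", ("AX", p))))  # not EG AX p
```

The four sample formulas are the ones listed above; the first two normalize to formulas using only A quantifiers, the last two do not.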

Clearly, if a formula f contains path quantifiers, then f and ¬f cannot
both be in V-CTL. Grumberg and Long [GL91] have shown that if p
satisfies every V-CTL formula satisfied by q, then q simulates p, and
conversely. Simulation is easily shown to be reflexive and transitive.
Thus simulation is a pre-order suitable for inductive proofs of V-CTL
formulas. Let p ≤ q iff q simulates p. This gives us the following
induction rule:

\[
\begin{array}{c}
q \models f \quad (f \in \text{V-CTL}) \\
p \le q \\
q \parallel p \le q \\
\hline
p \parallel \cdots \parallel p \models f
\end{array}
\]

as well as other rules engendered by various systems of safe
substitutions. Recall from section 2.6.1 that simulation is the greatest relation
between the states of q and the states of p such that if x simulates y,
then:

1. x and y agree on the atomic propositions, and

2. every successor of y is simulated by a successor of x.

A Kripke model q simulates p if every initial state of p is simulated by
some initial state of q. Since this relation can be expressed as a greatest
fixed point in the Mu-Calculus, it can be verified automatically using
the symbolic model checking technique. The fact that simulation is
not symmetric allows us more flexibility in constructing systems using
substitution rules than we would have using bisimulation.
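
The greatest fixed point characterization can be illustrated on small explicit-state models. The sketch below is not the symbolic (OBDD-based) computation used by the model checker; it enumerates state pairs directly. The dictionary layout of a Kripke model (`init`, `label`, `succ`) and the function names are assumptions made for this example.

```python
def greatest_simulation(q, p):
    """Greatest set of pairs (x, y), x a state of q and y a state of p,
    such that x simulates y."""
    # Start from all pairs that agree on the atomic propositions ...
    rel = {(x, y) for x in q["label"] for y in p["label"]
           if q["label"][x] == p["label"][y]}
    # ... and delete pairs until nothing changes (greatest fixed point).
    changed = True
    while changed:
        changed = False
        for (x, y) in sorted(rel):
            # Every successor of y must be simulated by a successor of x.
            ok = all(any((x2, y2) in rel for x2 in q["succ"][x])
                     for y2 in p["succ"][y])
            if not ok:
                rel.discard((x, y))
                changed = True
    return rel


def simulates(q, p):
    """True if every initial state of p is simulated by an initial state of q."""
    rel = greatest_simulation(q, p)
    return all(any((x, y) in rel for x in q["init"]) for y in p["init"])


# A one-state model that always shows "a" ...
specific = {"init": {"s"}, "label": {"s": "a"}, "succ": {"s": {"s"}}}
# ... and a more general model that may also switch to "b" forever.
general = {"init": {"t0"},
           "label": {"t0": "a", "t1": "b"},
           "succ": {"t0": {"t0", "t1"}, "t1": {"t1"}}}

general_simulates_specific = simulates(general, specific)
specific_simulates_general = simulates(specific, general)
```

The example shows the asymmetry noted above: the more nondeterministic model simulates the more specific one, but not the other way around.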
5.1.1  Induction  in  other  models

We can set up an induction framework for a variety of models by
establishing a pre-order and a set of monotonic process operators. For
models of concurrent automata (such as the s/r model [Kur85]), the
natural partial order is language containment.

In the s/r model, a process is an automaton which accepts infinite
strings over a Boolean algebra. There are two natural operators over
this class of processes. The automaton product operation simulates
parallel execution, while Boolean algebra homomorphisms can be used
to induce a renaming or abstraction of the variables by which processes
communicate. Kurshan shows that both of these operations respect
the relation of language containment between automata [Kur86]. An
example of induction in this framework can be found in [KM89].

Induction can also be applied in process algebras like CCS [Mil80]
which are based on two-way synchronization. In this case, there is a
variety of plausible process relations, including observational equivalence,
weak observational equivalence, and a number of pre-order relations on
processes. An induction example using the "may" pre-order for CCS
processes can also be found in [KM89].


5.2  Induction  and  SMV 

An induction framework can be set up for the SMV.0 language, based
on either simulation or bisimulation. This framework includes two kinds
of process operators: the parallel composition operator ∥ and renaming
operators based on maps φ from locations to locations. We will
show that both are monotonic with respect to simulation and
bisimulation.

5.2.1  Proving  compositionality 

Recall that semantically, an SMV.0 program denotes a triple (T, I, R),
where T assigns types to locations, I is the set of initial states, and
R is the transition relation. There are two basic process operators
provided by SMV: instantiation and parallel composition. An
instantiation results from a map φ on locations (a renaming). This renaming
induces a map Φ from states of φ(M) to states of M, such that for
all locations l, Φ(x)(l) = x(φ(l)). If M = (T, I, R) is a process, then
φ(M) = (T', I', R') where

1. T'(φ(l)) = T(l),

2. I' = {x | Φ(x) ∈ I} and

3. R' = {(x, y) | (Φ(x), Φ(y)) ∈ R}.
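
The three clauses above can be read directly as a construction on explicitly enumerated states. The following sketch is only an illustration under several assumptions: states are represented as frozen sets of (location, value) pairs, types are tuples of values, the renaming is given as a dictionary and is assumed to be legal, and the helper names `freeze`, `all_states` and `rename` are invented for this example.

```python
from itertools import product


def freeze(assignment):
    """Turn a dict {location: value} into a hashable state."""
    return frozenset(assignment.items())


def all_states(types):
    """Enumerate every assignment of values to the locations in `types`."""
    locations = sorted(types)
    for values in product(*(types[l] for l in locations)):
        yield dict(zip(locations, values))


def rename(process, phi):
    """Apply the renaming phi (dict: location of M -> location of phi(M))."""
    T, I, R = process
    # Clause 1: the renamed location keeps the type of the original.
    T2 = {phi[l]: T[l] for l in T}

    def Phi(x):
        # Induced state map: Phi(x)(l) = x(phi(l)).
        return freeze({l: x[phi[l]] for l in T})

    states = list(all_states(T2))
    # Clause 2: x is initial in phi(M) exactly when Phi(x) is initial in M.
    I2 = {freeze(x) for x in states if Phi(x) in I}
    # Clause 3: (x, y) is a transition exactly when (Phi(x), Phi(y)) is.
    R2 = {(freeze(x), freeze(y))
          for x in states for y in states if (Phi(x), Phi(y)) in R}
    return T2, I2, R2


# A one-bit process that starts at 0 and toggles, instantiated as "p.x".
T = {"x": (0, 1)}
I = {freeze({"x": 0})}
R = {(freeze({"x": 0}), freeze({"x": 1})),
     (freeze({"x": 1}), freeze({"x": 0}))}
T2, I2, R2 = rename((T, I, R), {"x": "p.x"})
```

Defining I' and R' through the state map, rather than by rewriting each state, is what lets the same code cover renamings that identify two visible locations.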

The rules for renaming require that no hidden location can be renamed
to a visible location, and no two distinct locations can be renamed
to the same hidden location. The following lemma and theorem show
that this definition of renaming is a suitable operation for inductive
reasoning using simulation or bisimulation:

Lemma 4 Let φ be a legal renaming, and let Φ be the state map induced
by φ. If x'₁ = Φ(x₁) and x'₂ agrees with x'₁ on the visible locations, then
there exists x₂ which agrees with x₁ on the visible locations, such that
x'₂ = Φ(x₂).

Proof. Construct x₂ as follows: For every location l, if l is in the
range of φ, then choose any l' such that l = φ(l'), and let x₂(l) = x'₂(l').
Otherwise, let x₂(l) = x₁(l).

First we show that x'₂ = Φ(x₂). For all locations l'', if φ(l'') is visible,
then l'' must also be visible (since hidden locations cannot legally be
renamed to visible locations). Let l = φ(l''). Since l is in the range
of φ, there is some visible l' such that l = φ(l') and x₂(l) = x'₂(l').
Since x'₁ and x'₂ agree on the visible locations, x'₂(l') = x'₁(l'). Since
x'₁ = Φ(x₁), x'₁(l') = x₁(l) = x'₁(l'') = x'₂(l''). Thus x'₂(l'') = x₂(φ(l'')).
On the other hand, if φ(l'') is hidden, there exists no other location l'
such that φ(l') = φ(l''), since two locations cannot legally be renamed
to the same hidden location. Therefore x₂(φ(l'')) = x'₂(l''). Thus, by
definition x'₂ = Φ(x₂).

Second, we show that x₁ and x₂ agree on the visible locations. Let l
be any visible location. If l is not in the range of φ, then x₁(l) = x₂(l) by
construction. Otherwise, there is an l' such that φ(l') = l and x₂(l) =
x'₂(l'). Since l' must be visible, x'₂(l') = x'₁(l') = x₁(φ(l')) = x₁(l). □
Theorem 9 Instantiation of SMV.0 modules is monotonic with respect
to simulation and bisimulation.

Proof. Imagine that we have two processes, M₁ and M₂, such that
M₂ simulates M₁.

First, let x₁ and x₂ be states of φ(M₁) and φ(M₂) respectively. We
show by induction that if x₁ and x₂ agree on the value of all visible
locations, and if Φ(x₂) simulates Φ(x₁), then x₂ n-simulates x₁, for all
n.

The basis case is trivial, since x₂ 0-simulates x₁ exactly when x₁
and x₂ agree on the value of all visible locations.

For the induction step, let (x₁, y₁) be any transition of φ(M₁). By
definition, (Φ(x₁), Φ(y₁)) is a transition of M₁. Hence, there exists a
transition (Φ(x₂), y'₂) of M₂ such that y'₂ simulates Φ(y₁). Since y'₂
simulates Φ(y₁), Φ(y₁) and y'₂ agree on the visible locations. By the lemma,
there must exist y₂ which agrees with y₁ on the visible locations, such
that y'₂ = Φ(y₂). By inductive hypothesis, y₂ (n − 1)-simulates y₁,
therefore x₂ n-simulates x₁.

Now we show that every initial state of φ(M₁) is simulated by some
initial state of φ(M₂). A state x₁ is initial in φ(M₁) exactly when
Φ(x₁) is initial in M₁. Let x'₁ = Φ(x₁). If x'₁ is initial in M₁, then it
is simulated by some x'₂ which is initial in M₂. Since x'₁ is simulated
by x'₂, they agree on the values of the visible locations. Hence, by the
lemma, there exists x₂ which agrees with x₁ on the visible locations,
such that x'₂ = Φ(x₂). By the above argument, x₁ is simulated by x₂.
Therefore, φ(M₁) is simulated by φ(M₂).

We can prove that renaming respects bisimulation by the same
argument, applied symmetrically. □

The parallel composition of SMV.0 programs is formed by taking
the union of the type functions T and the intersections of the initial sets
I and the transition relations R, after renaming the hidden variables
of each process onto disjoint spaces. We can show that this operation
is also suitable for inductive reasoning:

Theorem 10 Parallel composition of SMV.0 programs is monotonic
with respect to simulation and bisimulation.
Proof. Imagine that we have four processes, M₁, M₂, M'₁ and M'₂,
such that M₁ is simulated by M₂ and M'₁ is simulated by M'₂.

Let φ and φ' be the renamings associated with parallel composition.
These are identities over the visible locations, and map the hidden
locations onto disjoint ranges. Let Φ and Φ' be the induced state maps.
(Thus, given a state of a parallel composition M ∥ M', Φ yields the
corresponding state of M and Φ' yields the corresponding state of M'.)
Let x₁ be a state of M₁ ∥ M'₁ and let x₂ be a state of M₂ ∥ M'₂. We
show by induction that if Φ(x₁) is simulated by Φ(x₂) and Φ'(x₁) is
simulated by Φ'(x₂), then x₂ n-simulates x₁, for all n.

For the base case, since φ is the identity for the visible locations,
Φ(x₁) agrees with x₁ on the visible locations, as does Φ(x₂) with x₂.
Since Φ(x₁) is simulated by Φ(x₂), they also agree, therefore x₂
0-simulates x₁.

For the induction step, let u₁ = Φ(x₁), u'₁ = Φ'(x₁), u₂ = Φ(x₂),
u'₂ = Φ'(x₂). Let (x₁, y₁) be a transition of M₁ ∥ M'₁, and let v₁ = Φ(y₁)
and v'₁ = Φ'(y₁). By definition, (u₁, v₁) is a transition of M₁ and (u'₁, v'₁)
is a transition of M'₁. Since u₁ is simulated by u₂ and u'₁ is simulated by
u'₂, there must exist v₂ and v'₂ such that (u₂, v₂) is a transition of M₂,
(u'₂, v'₂) is a transition of M'₂, v₁ is simulated by v₂ and v'₁ is simulated
by v'₂. Now we construct y₂. For all visible locations l, let y₂(l) = y₁(l).
For all hidden locations l in the range of φ, there is a unique l' such that
l = φ(l'), since a renaming cannot legally map distinct locations onto
the same hidden location. Let y₂(l) = v₂(l'). Similarly, for all hidden
locations l in the range of φ', there is a unique l' such that l = φ'(l').
Let y₂(l) = v'₂(l'). By this construction, v₂ = Φ(y₂) and v'₂ = Φ'(y₂).
Hence, by inductive hypothesis, y₂ (n − 1)-simulates y₁. By definition,
(x₂, y₂) is a transition of M₂ ∥ M'₂. Therefore x₂ n-simulates x₁.

Now we show that every initial state of M₁ ∥ M'₁ is simulated by
an initial state of M₂ ∥ M'₂. Let x₁ be an initial state of M₁ ∥ M'₁ and
let u₁ = Φ(x₁), u'₁ = Φ'(x₁). By definition, u₁ is initial in M₁ and u'₁
is initial in M'₁. Hence there exist u₂ and u'₂ such that u₁ is simulated
by u₂, u'₁ is simulated by u'₂, u₂ is initial in M₂ and u'₂ is initial in
M'₂. We can construct x₂ such that u₂ = Φ(x₂) and u'₂ = Φ'(x₂) in the
same manner as we constructed y₂ above. By the above argument, x₁
is simulated by x₂.

We can prove that parallel composition respects bisimulation by the
same argument, applied symmetrically. □


5.2.2  Computing  simulation  relations 

Since the simulation and bisimulation relations can be expressed in
the Mu-Calculus (cf. section 2.6.1), they can be computed using the
symbolic model checking technique. In this way, we can automatically
test whether substituting a given module p for another module q is safe,
in the sense of preserving all CTL or V-CTL properties.

There are a few techniques that can improve the efficiency of this
process. The simplest is to note that simulation between two states
implies that they agree on the values of the visible locations.
Therefore, there is no need to use separate OBDD variables to encode the
visible locations of the two processes when representing the simulation
relation. As with CTL model checking, we can compute the
reachable state space of the two programs, and use these sets to restrict the
computation of the equivalence relation. In cases where the
simulation relation cannot be computed, we can instead compute a stronger
relation between the programs, which requires that all pairs of states
which are simultaneously reachable (reachable along paths which agree
on the visible locations) are 1-similar. This relation can be tested by
a forward search of the reachable state space of the composition of the
two programs. In the case of deterministic programs (in which no two
successors of a given state agree on all of the visible locations), this
amounts to a test of string language containment [GL91]. In either
approach, if the test fails, we can extract as a counterexample a pair
of paths, such that all corresponding states are 0-similar, and the last
pair fails to be 1-similar. This test can be used to formulate another
guess for the process invariant, until a sound invariant is found.
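
The forward search over simultaneously reachable pairs can be sketched on explicit-state models as follows. This is an illustration under assumptions, not the symbolic procedure: the dictionary layout of a model (`init`, `label`, `succ`) and the name `find_counterexample` are invented here, and a state of p with no matching partner is reported as a pair whose second component is `None`.

```python
from collections import deque


def find_counterexample(p, q):
    """Search pairs (y, x), y in p and x in q, reachable along paths that
    agree on the labels. Return a path of pairs ending in a successor of p
    that q cannot match, or None if every reachable pair is 1-similar."""
    queue, parent = deque(), {}
    for y in p["init"]:
        matches = [x for x in q["init"] if q["label"][x] == p["label"][y]]
        if not matches:
            return [(y, None)]
        for x in matches:
            parent[(y, x)] = None
            queue.append((y, x))
    while queue:
        pair = queue.popleft()
        y, x = pair
        for y2 in p["succ"][y]:
            matches = [x2 for x2 in q["succ"][x]
                       if q["label"][x2] == p["label"][y2]]
            if not matches:
                # Rebuild the path of 0-similar pairs leading to the failure.
                path = [(y2, None)]
                while pair is not None:
                    path.append(pair)
                    pair = parent[pair]
                return path[::-1]
            for x2 in matches:
                if (y2, x2) not in parent:
                    parent[(y2, x2)] = pair
                    queue.append((y2, x2))
    return None


specific = {"init": {"s"}, "label": {"s": "a"}, "succ": {"s": {"s"}}}
general = {"init": {"t0"},
           "label": {"t0": "a", "t1": "b"},
           "succ": {"t0": {"t0", "t1"}, "t1": {"t1"}}}

ok = find_counterexample(specific, general)       # general can follow specific
failure = find_counterexample(general, specific)  # specific cannot show "b"
```

A returned path plays the role of the counterexample described above: its pairs agree on the labels, and its last entry is a step of p that has no matching step in q.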

SMV supports induction in the following way. Each hypothesis of
an induction rule is of the form p ≤ q, where p and q are modules.
This is completely general, since module p can be an arbitrary parallel
composition of instances of other modules. By inserting the declaration

SIMULATES p

in module q, we cause the SMV model checker to test whether q
simulates p and if not, to produce a counterexample.

5.2.3  Induction and SMV.1

Is it possible to extend the above framework to SMV.1, which includes
interleaving processes? Unfortunately, the answer is no. Consider, for
example, the following two modules, which are bisimilar:

MODULE a
VAR
  x : boolean;
ASSIGN
  init(x) := 0;
  next(x) := 0;

MODULE b
VAR
  x : boolean;
ASSIGN
  init(x) := 0;
  next(x) := x;

Note, however, that if we substitute a for b in the following program,
the resulting program is not bisimilar to the original:

MODULE main
VAR
  p : process b;
ASSIGN
  next(p.x) := 1;

This is because process main may intervene between steps of process
p, changing the value of p.x to 1. In this state, which is not reachable in
a or b alone, the two modules have different behaviors. Hence parallel
composition in SMV.1 does not respect bisimulation (neither does it
respect simulation). This problem is a general feature of languages that
support interleaving processes with shared variables. It is difficult, for
example, to formulate a compositional rule for the leads-to operator
of UNITY logic [CM88]. For this reason, we will use only the SMV.0
subset for induction over processes.
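
The counterexample can be replayed with a very small explicit model. The sketch below makes simplifying assumptions that are not taken from the SMV.1 semantics itself: the whole state is the single bit p.x, each step is taken by exactly one of the two processes, and the stepping process alone determines the next value. The function and variable names are invented for this illustration.

```python
def reachable_transitions(step_p, step_main, init=0):
    """Collect all transitions (x, x') of the interleaved system that are
    reachable from the initial value of p.x."""
    seen, frontier, transitions = {init}, [init], set()
    while frontier:
        x = frontier.pop()
        # Either process p or process main takes the next step.
        for nxt in (step_p(x), step_main(x)):
            transitions.add((x, nxt))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return transitions


def module_a(x):
    return 0  # next(x) := 0


def module_b(x):
    return x  # next(x) := x


def main(x):
    return 1  # next(p.x) := 1


with_a = reachable_transitions(module_a, main)
with_b = reachable_transitions(module_b, main)
```

Run alone, both modules only ever produce the transition from 0 to 0. Interleaved with main, the instance of a can reset p.x from 1 back to 0 while the instance of b cannot, so the two compositions differ in exactly that transition.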
Figure 5.2: Substitution generating processors on bus

5.3  Example:  The  Gigamax  protocol 

In this section, we formulate a safe substitution rule that generates an
arbitrary number of Gigamax processors attached to a cluster bus. As
our invariant, we will use a single processor module attached to the
end of a cluster bus. The strategy will be to generalize this module by
adding nondeterministic choice until it is able to simulate itself with one
additional processor attached, as viewed from the bus. The
counterexamples produced by the model checker will provide clues as to how the
proposed invariant should be generalized. After a correct invariant is
obtained, we can use this invariant to prove properties of the protocol
that hold independent of the number of processors on a cluster bus.

The general form of the substitution rule we use is depicted in
figure 5.2. The SMV code representing the left hand side is shown in
figure 5.3, and the code for the right hand side (the invariant)
is shown in figure 5.4. In our first guess for the invariant, we will use
the original processor model from the previous chapter. Our approach
will be to add behaviors (i.e., non-determinism) to the processor model
until we have a correct invariant.

Essentially, we are testing whether one processor can mimic the
actions of two processors as seen from the bus. Checking this produces a
counterexample in which one of the two processors reaches the owned
state, then the second processor issues a read command. This behavior
cannot be produced by a single processor. To fix this problem, we can
modify the processor model so that a processor is allowed to issue a
read command in the owned state. It then sets its own "snoop" flag,
and enters the shared state on a read-shared, and the owned state on
a read-owned. Testing this new invariant produces another
counterexample in which the first processor reaches the owned state, then issues
a read command (thus setting its snoop and waiting bits), then the
second processor issues a read command. One processor alone cannot
produce this behavior, since it cannot issue a second read command
while its waiting flag is set. We modify the processor model to allow
this behavior. Note that this behavior is safe, since the second read
command is blocked by the waiting flag which is already set. With this
modification, we have a correct invariant.

MODULE rule(CMD,REPLY-OWNED,REPLY-WAITING,REPLY-STALL,
            cmd,reply-owned,reply-waiting,reply-stall,master)
VAR
  b : bus-connector(self,p);
  p : processor(CMD,REPLY-OWNED,REPLY-WAITING,REPLY-STALL);
  q : invariant(CMD,REPLY-OWNED,REPLY-WAITING,REPLY-STALL,
        b.cmd,b.reply-owned,b.reply-waiting,b.reply-stall,b.master);

MODULE bus-connector(a,b)
ASSIGN b.master := !a.master union 0;
DEFINE
  cmd :=
    case
      a.master : a.cmd;
      b.master : b.cmd;
      1 : idle;
    esac;
  reply-owned := a.reply-owned | b.reply-owned;
  reply-waiting := a.reply-waiting | b.reply-waiting;
  reply-stall := a.reply-stall | b.reply-stall;
  master := a.master | b.master;

Figure 5.3: Substitution rule for adding one processor

OPAQUE MODULE invariant(CMD,REPLY-OWNED,REPLY-WAITING,
            REPLY-STALL,cmd,reply-owned,reply-waiting,reply-stall,master)
SIMULATES rule
VAR
  b : bus-connector(self,p);
  p : processor(CMD,REPLY-OWNED,REPLY-WAITING,REPLY-STALL);
  t : bus-terminator(CMD,REPLY-OWNED,REPLY-WAITING,REPLY-STALL,
        b.cmd,b.reply-owned,b.reply-waiting,b.reply-stall,b.master);

MODULE bus-terminator(CMD,REPLY-OWNED,REPLY-WAITING,
            REPLY-STALL,cmd,reply-owned,reply-waiting,reply-stall,master)
ASSIGN
  CMD := cmd;
  REPLY-OWNED := reply-owned;
  REPLY-WAITING := reply-waiting;
  REPLY-STALL := reply-stall;

Figure 5.4: The invariant

Using this invariant, we can check properties of the system in
V-CTL, using the invariant in place of the processors on the cluster buses.
The substitution rule can be applied as many times as necessary to
produce a system with an arbitrary number of processors while
preserving all of the verified properties. We can also refer these properties
back to our original model by showing that the generalized processor
model simulates the original one. In order to verify properties such as
deadlock freedom, however, which use existential path quantifiers, it
would be necessary to prove bisimulation rather than simulation. This
means that it would not be possible to use the strategy of generalizing
the original model until an invariant is reached, since the generalized
model would not be bisimulation equivalent to the original model.


5.4  Related  research 

A number of methods have been proposed in the past for extending
automatic verification to parameterized designs that have an arbitrary
number of similar or identical processes.

The first to approach this question were Browne, Clarke and
Grumberg [BCG86], who extended the logic CTL to a logic called indexed
CTL. This logic allows the restricted use of process quantifiers as in
the formula ⋁ᵢ f(i), which means that the formula f holds for some
process i. Restricting the use of these quantifiers and eliminating the
next-time operator makes it impossible to write a formula which can
distinguish the number of processes in a system. By establishing an
appropriate equivalence between a system with n processes and a
system with n + 1 processes, one can guarantee that all systems satisfy
the same set of formulas in the indexed logic. This method was used to
establish the correctness of a mutual exclusion algorithm by exhibiting
a bisimulation relation between an n-process system and a 2-process
system, and applying model checking to the 2-process system.

A disadvantage of the indexed CTL method is that the bisimulation
relation must be proved in an ad hoc manner. Finite state methods
cannot be used to check it because it is a relation between a
finite-state process and a process with an arbitrary number of states. Clarke
and Grumberg dealt with the problem of establishing a bisimulation by
introducing the notion of a process closure P*. This process must be
derived by hand, and have the property that Mᵣ ∥ P* is equivalent to
Mᵣ₊₁ ∥ P* for some small r. This can be verified mechanically. Shtadler
and Grumberg took this notion a step further by introducing network
grammars to describe classes of finite state systems. This technique
used an indexed form of linear temporal logic, and required that the
processes on the left and right hand sides of each grammar rule be
equivalent in an appropriate sense.

The requirement that all systems generated by the grammar be
equivalent seems to be a rather strict limitation, however. The method
of this chapter, which uses a partial order rather than an equivalence,
was first proposed by Kurshan and McMillan [KM89], and
simultaneously by Wolper and Lovinfosse [WL89]. Around the same time,
Burch was also applying a similar idea to Dill's trace theory for
speed-independent circuits.¹

Another method for proving properties of systems of identical
processes is due to German and Sistla [GS]. It uses a linear-time temporal
logic for specifications (again, the next-time operator is not allowed)
and is fully automatic. By means of a distinguished "control" process,
it is possible to check some global properties (although process
quantifiers are not present in the logic). Unfortunately, because the decision
algorithm is doubly exponential in the process size, this method has
not been applied in practice.

A system called GORMEL has been created by Marelly and
Grumberg, implementing the techniques of [SG89]. GORMEL uses context
free grammars to describe systems of processes. This is fairly similar to
the use of module substitution rules in SMV. There are a number of
differences between the systems, however. GORMEL is oriented towards
verification of distributed algorithms. It uses a model of transition
systems with pairwise synchronized actions, as in CCS. This model is not
well suited for describing digital systems: first because most signals in
hardware are broadcast to more than one location, and second because
many signals are exchanged back and forth between components of a
system in a single clock cycle. The difficulty of reducing this two way
exchange of many signals to a single atomic action would make it
extremely cumbersome to create a CCS-like model for a system like the
Gigamax.

Another difference is in the logic: GORMEL uses an indexed
version of LTL without next-time called LTL'. As in indexed CTL, it is
not possible to nest process quantifiers. Because of the ability to use
process quantifiers, it is possible to express some properties which are
not expressible in CTL, for example that if a proposition p is true in
some process, then it is eventually true in all processes.

For the process relation, GORMEL uses a form of stuttering
equivalence rather than simulation. This places fairly strong requirements
on the allowable grammar rules. In particular, it is not possible in such
a system to take the approach taken here of successively generalizing a
component process in order to obtain an invariant, since the required
relation between the left and right hand sides of the grammar rule is a
symmetric one. The GORMEL approach will work if the various
systems generated by the grammar can be distinguished only by stuttering
(arbitrary repetition of the same state labeling).

¹ Personal communication

A final difference between the systems is, of course, that SMV is
based on symbolic model checking methods. This is not clearly an
advantage, however, since the state explosion problem may not be very
severe for the small number of processes that tend to be involved in
induction rules.


Chapter  6 

A  partial  order  approach 


In this chapter, we consider an alternative to the symbolic model
checking method which is also aimed at avoiding the state explosion
problem. A number of researchers have observed that the arbitrary
interleaving of concurrent actions is a major contributor to the state
explosion problem, and that substantial efficiencies could be obtained
if the enumeration of all possible interleavings could be avoided. As
a result, several have proposed verification algorithms based on
partial orders [Val89, Val90, God90, GW91, PL89, PL90, YTK91]. The
method presented here is based on unfolding a Petri net into an acyclic
structure called an occurrence net. The notion of unfolding was
introduced by Nielsen, Plotkin and Winskel as a means for giving a
concurrent semantics to nets, but in this case the goal is to avoid the
state explosion problem. An algorithm is introduced for constructing
the unfolding of a net, which terminates when the unfolded net
represents all of the reachable states of the original net. The
unfolding is therefore adequate for testing reachability (to be more
precise, coverability) and deadlock properties. It is shown using an
asynchronous circuit example that the unfolding can be polynomial in
the circuit size while the state space is exponential. In contrast, the
stubborn sets method of Valmari [Val89, Val90] and trace automaton
method of Godefroid [God90, GW91] are ineffective in reducing the state
explosion problem for asynchronous circuit models, because of the
ubiquity of confusion in such models. In addition, because the
unfolding method is fully automatic, it has a certain advantage over
the behavior machines method of Probst [PL89, PL90], which requires a
pomset grammar describing the circuit's behavior to be constructed by
hand.


6.1  The  unfolding  operation 

Briefly, an occurrence net is a Petri net without backward conflict
(two transitions outputting to the same place), and without cycles.
Such a net can be obtained from an ordinary place/transition net by an
unfolding process. Figure 6.1 shows an example of a net and part of
its unfolding. Since the occurrence net is acyclic and rooted, there is
a natural well-founded (partial) order on the transitions and places of
the net. This order is called the dependency order. It is impossible
for a transition of the occurrence net to fire unless all of its
predecessors in the dependency order have fired.

The most important theoretical notion regarding occurrence nets is
that of a configuration. A configuration represents a possible partial
run of the net - it is any set of transitions that satisfies the
following conditions:

1. If any transition is in the configuration, then so are all of its
predecessors in the dependency order (a configuration is downward
closed).

2. A configuration cannot contain two transitions in conflict,
meaning that both input from the same place.

An example of a configuration is shown in figure 6.2, with elements
of the configuration filled in black. Two transitions in the figure are
hatched in. Either of these transitions can be added to the black set
to form a new configuration. Adding any other transition would be
illegal, however, since it would violate either downward closure or
conflict-freeness.
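
The two conditions above can be checked mechanically. The following Python sketch is illustrative only: the dictionary encoding of an occurrence net (each transition mapped to its input and output places) and the names EVENTS, direct_predecessors and is_configuration are assumptions made for this example, not notation from the text.

```python
# Hypothetical encoding of a small occurrence net:
# transition -> (input places, output places).
# e1 and e2 both input from c0, so they are in conflict.
EVENTS = {
    "e1": ({"c0"}, {"c1"}),
    "e2": ({"c0"}, {"c2"}),
    "e3": ({"c1"}, {"c3"}),
}

def direct_predecessors(events, e):
    """Transitions that produce one of the input places of e."""
    producer = {c: name for name, (_, post) in events.items() for c in post}
    return {producer[c] for c in events[e][0] if c in producer}

def is_configuration(events, config):
    config = set(config)
    consumed = set()
    for e in config:
        # Condition 1: downward closure. Checking the direct predecessors of
        # every member suffices, since those predecessors are checked in turn.
        if not direct_predecessors(events, e) <= config:
            return False
        # Condition 2: no two members input from the same place.
        if consumed & events[e][0]:
            return False
        consumed |= events[e][0]
    return True
```

Under this encoding, {e1, e3} is accepted, {e3} alone is rejected because e1 is missing, and {e1, e2} is rejected because both input from c0.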

Figure 6.1: Unfolding example. (a) Petri net. (b) Unfolded to
occurrence net.

In an unfolding, each transition corresponds to a transition of the
original net, and each place corresponds to a place of the original
net. We can associate each configuration of the unfolding with a state
(marking) of the original net by simply identifying those places whose
tokens are produced but not consumed by the transitions in the
configuration. This set is marked with black dots in figure 6.2.
Mapping this set back onto the original net, we obtain the final state
of the configuration.
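
As a small illustration of the final state, the sketch below computes it for a hand-built unfolding of a two-transition cycle. The encoding, the names PLACE_OF, INITIAL, EVENTS and final_state, and the choice to treat the initial places as already produced are all assumptions of this example.

```python
# Hypothetical unfolding of the cyclic net p -t-> q -u-> p with a token on p.
PLACE_OF = {"c0": "p", "c1": "q", "c2": "p"}  # unfolding place -> original place
INITIAL = {"c0"}                              # places of the initial marking
EVENTS = {
    "e1": ({"c0"}, {"c1"}),  # instance of t
    "e2": ({"c1"}, {"c2"}),  # instance of u
}

def final_state(events, initial, place_of, config):
    """Places whose tokens are produced but not consumed by config,
    mapped back onto the original net."""
    produced, consumed = set(initial), set()
    for e in config:
        pre, post = events[e]
        consumed |= pre
        produced |= post
    return {place_of[c] for c in produced - consumed}
```

Here the empty configuration and the configuration {e1, e2} both map back to the marking {p}, while {e1} maps to {q}.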

The final theoretical notion we need regarding unfoldings is that
of a local configuration. The local configuration associated with any
transition consists of that transition and all of its predecessors in
the dependency order (that is, the downward closure of the transition
as a singleton). This is the set of transitions which necessarily are
contained in any configuration containing the given transition. Note
that a local configuration may not exist if this set contains two
transitions in conflict.
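
A local configuration can be computed as a simple closure. The following sketch assumes the same hypothetical dictionary encoding (transition mapped to input and output places); the function name local_configuration and the convention of returning None when no local configuration exists are choices made for this example.

```python
# Hypothetical occurrence-net fragment. e4 needs outputs of both e2 and e3,
# but e3 depends on e1, which conflicts with e2 (shared input c0).
EVENTS = {
    "e1": ({"c0"}, {"c1"}),
    "e2": ({"c0"}, {"c2"}),
    "e3": ({"c1"}, {"c3"}),
    "e4": ({"c2", "c3"}, {"c4"}),
}

def local_configuration(events, e):
    """Downward closure of {e}, or None if it contains a conflict."""
    producer = {c: name for name, (_, post) in events.items() for c in post}
    config, stack = set(), [e]
    while stack:
        x = stack.pop()
        if x in config:
            continue
        config.add(x)
        stack.extend(producer[c] for c in events[x][0] if c in producer)
    consumed = set()
    for x in config:
        if consumed & events[x][0]:  # two members input from the same place
            return None
        consumed |= events[x][0]
    return config
```

With this data, e3 has the local configuration {e1, e3}, whereas e4 has none, since its closure contains both e1 and e2.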

We are now ready to consider the problem of building a fragment
of the unfolding which is large enough to represent all of the
reachable markings of the original net. Building the unfolding itself
is straightforward. The process starts with a set of places
corresponding to the initial marking of the original net. The unfolding
is grown by finding a set of places in the unfolding which correspond
to the inputs (preset) of a transition in the original net, then adding
a new instance of that transition to the unfolding, as well as a new
set of places corresponding to its outputs (postset). If the new
transition has no conflicts in its local configuration (more precisely,
if it has a local configuration) it is kept, otherwise it is discarded.
This is because the existence of a conflict means that the new
transition can occur in no configurations of the unfolding.

The key to termination of the unfolding is to identify a set of
transitions of the unfolding to act as cutoff points. This set must
have the following property: any configuration containing a cutoff
point must be equivalent (in terms of final state) to some
configuration containing no cutoff points. From this definition, it
follows that any successor of a cutoff point can be safely omitted from
the unfolding, without sacrificing any reachable markings of the
original net. To see this, suppose we have built the unfolding only up
to the cutoff points, in the sense that any new transition we can add
must have a cutoff point as a predecessor. From this point on, any
transition we add must be descended from some cutoff point. Thus, any
configuration we might add to the unfolding must have the same final
state as some configuration already present.

A sufficient condition for a transition to be a cutoff point is the
following: the final state of its local configuration is the same as
that of some other transition whose local configuration is smaller.
The proof of this statement is as follows: suppose there are two
transitions t1 and t2, whose local configurations have the same final
state, with that of t2 being smaller. Now imagine a configuration C1
(local or otherwise) containing t1. We can obtain C1 from the local
configuration of t1 by adding the transitions in the difference one at
a time, in an order consistent with the dependency relation. According
to our construction, at each step of this process, there is a
corresponding transition we can add to the local configuration of t2
leading to the same final state. Hence, we can build a configuration C2
containing t2 which has the same final state, but is at least one
transition smaller than C1, since we started from a smaller set. Thus
if any configuration contains a cutoff point, it is equivalent to a
smaller configuration. Configurations cannot be made arbitrarily
small, however, so any configuration containing a cutoff point must be
equivalent to a configuration not containing a cutoff point. Since all
the reachable states are represented by configurations containing no
cutoff points, it is unnecessary to build the unfolding beyond any
cutoff point.

We can find the cutoff points by simply keeping a hash table of all
transitions, indexed by the final state of the local configuration. If,
when generating a transition, we find in the table a transition with
equivalent but smaller local configuration, we discard the new
transition. We can show, as follows, that this process is guaranteed to
terminate if the original net is bounded and finite. First, the depth
of the unfolding must be bounded by the number of reachable markings.
The depth of a given transition in the unfolding is the length of the
longest chain of predecessors of that transition. Each transition in
this chain has a local configuration, and these local configurations
form a chain of increasing size. If the depth of the given transition
is greater than the number of reachable markings of the original net,
then by the pigeonhole principle, two of these local configurations
must have the same final state. This cannot be, however, since in this
case one of the transitions in the chain would have been determined to
be a cutoff point. If the original net is bounded, it has a finite
number of reachable markings, hence the depth of the unfolding is
bounded. If the original net is finite, we can show by induction that
the number of transitions at any given depth in the unfolding is
finite. Hence the total number of transitions generated by the
unfolding process is finite.

Figure 6.3: Dining philosophers net.

As an example of termination, consider the net of figure 6.3, which
represents the dining philosophers paradigm. In this scenario, there
are n concurrent processes (philosophers), each of which must acquire
the use of two shared resources (forks) in order to execute its
critical section (eating spaghetti). The processes are organized in a
ring, with each neighboring pair sharing one resource. Figure 6.4
shows the completed unfolding for the case of three philosophers
(n = 3). The cutoff points are marked in the figure. The local
configuration of each of these transitions is equivalent to the empty
configuration. We observe that the size of the unfolding is not only
bounded, but is linear in the number of philosophers, while the number
of states is exponential as shown in table 6.1.

Recall that in growing the unfolding, it is necessary to enumerate all
of the subsets of places which correspond to the inputs of transitions.
The complexity of this is $\binom{n}{i}$, where n is the size of the
unfolding, and i is the largest number of inputs of any transition.
This is, of course, bounded by $n^i$, which is polynomial given a fixed
value of i. In practice, however, the number of subsets which are
considered can be reduced quite effectively, using the following two
techniques. First, suppose we are enumerating the subsets: we need not
add any place to the set if the result would not be contained in the
set of inputs of any transition. Second, whenever a place is added to
the set, we can immediately eliminate from consideration all of the
places which have a predecessor in conflict with a predecessor of the
new element, since any transition with both places as inputs would be
discarded. We add transitions to the net in order of increasing size of
the local configuration, so that we can use a hash table to determine
whether or not each transition is a cutoff point. Thus, whenever a
candidate for a transition in the unfolding is generated, it is placed
in a queue ordered by increasing local configuration size. The places
of the net are enumerated by pulling the first element t' from this
queue, testing whether it is a cutoff point, and if not, generating
places for its outputs. The procedure terminates when the queue of
candidate transitions becomes empty. Figures 6.5 and 6.6 show a
pseudo-code implementation of this procedure. The pseudo-code is
written somewhat inefficiently in places for simplicity.

In function Unfold, the arguments P, T and M0 are the places,
transitions and initial marking of the original net. Each place in the
unfolding is represented by a pair (place, preds), where place is the
corresponding place in the original net, and preds is the set of
immediate predecessor transitions in the unfolding (note that since
there is no backward conflict, the size of this set is at most one).
Each transition in the unfolding is represented by a pair (trans,
preds), where trans is the corresponding transition in the original
net, and preds is the set of immediate predecessor places in the
unfolding. The function returns P' and T', the sets of places and
transitions, respectively, of the unfolding. There is also a queue Q'
of transitions to be expanded, and a hash table (HashTable) used for
identifying cutoff points.
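
The procedure described above can be sketched in runnable form. The Python below is a minimal illustration under stated assumptions, not the implementation used for the experiments: the net encoding (transition name mapped to sets of input and output places), the helper independent_cycles, the naive enumeration of candidate presets with itertools.product, and seeding the table with the initial marking at size zero (consistent with the remark that a local configuration may be equivalent to the empty configuration) are all choices made for this sketch. Final states are computed by adding outputs and subtracting inputs over the configuration.

```python
import heapq
from collections import Counter
from itertools import product

def unfold(transitions, initial_marking):
    """transitions: name -> (set of input places, set of output places)."""
    conds = [(p, None) for p in sorted(initial_marking)]  # (place, producer)
    events, local, cutoffs, queue = [], [], set(), []

    def final_state(config):
        marking = Counter(initial_marking)
        for e in config:
            pre, post = transitions[events[e][0]]
            marking.update(post)
            marking.subtract(pre)
        return frozenset((+marking).items())  # drop places with no token

    def generate(new_conds):
        by_place = {}
        for i, (place, _) in enumerate(conds):
            by_place.setdefault(place, []).append(i)
        for name, (pre, _) in transitions.items():
            pools = [by_place.get(p, []) for p in sorted(pre)]
            for combo in product(*pools):
                if not new_conds.intersection(combo):
                    continue  # no new place involved: considered earlier
                config = frozenset().union(
                    *(local[conds[c][1]] for c in combo
                      if conds[c][1] is not None))
                used = [c for e in config for c in events[e][1]]
                # Reject if two events share an input (conflict) or if a
                # chosen place is already consumed inside the configuration.
                if len(used) != len(set(used)) or set(used) & set(combo):
                    continue
                events.append((name, frozenset(combo)))
                local.append(config | {len(events) - 1})
                heapq.heappush(queue, (len(local[-1]), len(events) - 1))

    best = {final_state(frozenset()): 0}  # smallest size seen per final state
    generate(set(range(len(conds))))
    while queue:
        size, e = heapq.heappop(queue)
        state = final_state(local[e])
        if best.get(state, size) < size:
            cutoffs.add(e)  # same final state, smaller local configuration
            continue
        best.setdefault(state, size)
        first_new = len(conds)
        conds.extend((p, e) for p in sorted(transitions[events[e][0]][1]))
        generate(set(range(first_new, len(conds))))
    return events, conds, cutoffs

def independent_cycles(n):
    """n independent two-place cycles: 2**n reachable markings."""
    transitions = {}
    for i in range(n):
        transitions[f"up{i}"] = ({f"a{i}"}, {f"b{i}"})
        transitions[f"down{i}"] = ({f"b{i}"}, {f"a{i}"})
    return transitions, {f"a{i}" for i in range(n)}
```

For the hypothetical family built by independent_cycles, the sketch generates two transitions per cycle, one of which is a cutoff point, so the unfolding grows linearly while the number of reachable markings doubles with each added cycle.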

Coverability problems can be solved using the unfolding in the
following way. Imagine we have a set of places in the original net, and
we wish to determine whether this set can ever be simultaneously
marked. We simply add a new transition to the net, whose inputs are
the given set, and then construct the unfolding. If the unfolding
contains any instance of this new transition, the set is coverable,
and otherwise not.

global P', T', Q', HashTable[]

function Unfold(P, T, M0)
    P' = T' = Q' = ∅; clear HashTable
    for each p ∈ M0 do
        add p' = (p, ∅) to P'
        GenTrans({p'}, T)
    end for
    while the queue Q' is not empty do
        pull the first t' off of Q'
        if not IsCutoffPoint?(t') then
            for each p in outputs of trans(t') do
                add p' = (p, {t'}) to P'
                GenTrans({p'}, T)
            end for
        end if
    end while
    return(P', T')
end function

procedure GenTrans(S', T)
    if not exists t ∈ T such that place(S') ⊆ inputs of t then return
    if Predecessors(S') has forward conflict then return
    forall t ∈ T do if place(S') = inputs of t then
        add t' = (t, S') to set T'
        insert t' in Q' in order of |LocalConfig(t')|
    end for
    for all p' ∈ P' where p' older than any member of S' do
        GenTrans(S' ∪ {p'}, T)
    end for
end procedure

Figure 6.5: Pseudo-code implementation of unfolding procedure

function IsCutoffPoint?(t1')
    C1' = LocalConfig(t1')
    S1' = FinalState(C1')
    L' = HashTable[HashFun(S1')]
    forall t2' in L' do
        C2' = LocalConfig(t2')
        if S1' = FinalState(C2') and Size(C2') < Size(C1') then return(1)
    end for
    add t1' to HashTable[HashFun(S1')]
    return(0)
end function

function LocalConfig(t')
    return(Predecessors({t'}) ∩ T')
end function

function Predecessors(S')
    do
        S' = S' ∪ preds(S')
    until S' unchanged
    return(S')
end function

function FinalState(C')
    let S' be the set of all p' ∈ P' such that preds(p') ⊆ C'
    return(place(S' - preds(C')))
end function


Figure 6.6: Pseudo-code, continued.

Figure 6.7: Translation from circuit to net


6.2  Application  example 

We now consider a more realistic example than the dining philosophers
- a speed-independent [Sei80b] circuit designed to implement a
distributed mutual exclusion (DME) protocol. The circuit was designed
by Alain Martin [Mar85] and has been analyzed using an abstracted
trace theoretic model by Dill [Dil88]. It was also used as an example
for the symbolic model checking method in section 2.1.2.

Networks of logic gates in speed-independent circuits are readily
modeled by Petri nets. A network of n gates can be modeled by a Petri
net of O(n) places. When we model a network of gates as a Petri net,
we introduce two places for each input of each gate. One represents
the input in a logic low state, while the other represents the input in
a logic high state. Transitions in the Petri net correspond to rising
or falling transitions of gate outputs. A rising transition of a gate
output removes all the logic low tokens from the inputs to which it is
connected, and places tokens on the corresponding logic-high places.

As an example, figure 6.7 shows the net fragment representing an
AND gate. When both inputs of the gate are at the logic high state, we
can move a token from the place representing logic low at the output
to the place representing logic high. Similarly, if either input is at
the logic low state, we can move a token from the place representing
logic high at the output to the place representing logic low.
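
The AND gate translation just described can be written down as a small net fragment. The sketch below is only one possible encoding: the place names ("a.hi", "a.lo"), the transition names, the use of a single low/high pair of places for the output (that is, a single fan-out), and the modeling of input tests as arcs that consume and immediately restore a token are all assumptions of this example rather than details taken from figure 6.7.

```python
def and_gate(a, b, out):
    """Net fragment for out = a AND b: name -> (input places, output places)."""
    hi = lambda s: f"{s}.hi"
    lo = lambda s: f"{s}.lo"
    return {
        # Output rises when both inputs are high.
        f"{out}+": ({hi(a), hi(b), lo(out)}, {hi(a), hi(b), hi(out)}),
        # Output falls when either input is low.
        f"{out}-/{a}": ({lo(a), hi(out)}, {lo(a), lo(out)}),
        f"{out}-/{b}": ({lo(b), hi(out)}, {lo(b), lo(out)}),
    }

def enabled(transitions, marking):
    """Names of transitions whose input places are all marked."""
    return {t for t, (pre, _) in transitions.items() if pre <= marking}

def fire(transitions, marking, t):
    """Marking reached by firing t (markings are sets of marked places)."""
    pre, post = transitions[t]
    return (marking - pre) | post
```

For instance, from the marking {a.hi, b.hi, y.lo} only the rising transition y+ is enabled, and firing it moves the output token from y.lo to y.hi.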

A dynamic hazard occurs, for example, if the AND gate's output is
enabled to rise while one of the inputs is enabled to fall. The problem
of whether or not a dynamic hazard can occur can thus be posed as a
coverability problem. Alternatively, since dynamic hazards correspond
to dynamic conflicts in the unfolding, the problem can be solved by
constructing the unfolding and examining it for dynamic conflicts,
i.e., two transitions which are in conflict, and which may be
simultaneously enabled. The DME circuit also uses special two-way
mutual exclusion elements as components, which are immune to certain
hazards. In checking the DME ring for hazards, we ignore conflicts
between rising transitions of a mutual exclusion element's acknowledge
outputs.

Table 6.2 shows the results of the occurrence net unfolding procedure
(ONU), and a depth-first traversal of the state space (DFT), for the
Petri net model of the circuit, for rings with one to five cells. The
depth of the occurrence net unfolding for the case of 5 cells was 141
transitions. The number of transitions in the ONU increases
quadratically in the number of cells. This is because as the number of
cells in the ring increases, a request must be relayed through a
greater number of stages in order to obtain the token, in the worst
case. At the same time, the number of cells which are requesting also
increases. The occurrence net therefore grows in both width and depth
in proportion to the number of cells. As we increase the number of
cells in the ring, the number of reachable global markings increases
exponentially. For this reason, it was only possible to apply DFT to a
system of five cells before the available memory resources were
exhausted. It is known, however, from using OBDD based methods, that
the number of states increases asymptotically by slightly less than a
factor of ten for each added cell.
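
The growth rates quoted above can be read off the figures of Table 6.2 with a few lines of arithmetic. The sketch below simply copies the table's numbers; the variable names are ours, and five data points can only suggest, not establish, an asymptotic trend.

```python
# Figures copied from Table 6.2 (rings of 1 to 5 cells).
unfolding = [23, 125, 313, 604, 1013]       # ONU size in transitions
states = [22, 502, 6579, 75172, 802425]     # reachable states found by DFT

# For quadratic growth the first differences grow roughly linearly,
# so the second differences stay small compared with the sizes themselves.
first = [b - a for a, b in zip(unfolding, unfolding[1:])]
second = [b - a for a, b in zip(first, first[1:])]

# For exponential growth the ratio of successive state counts levels off.
ratios = [b / a for a, b in zip(states, states[1:])]

print(first)                          # [102, 188, 291, 409]
print(second)                         # [86, 103, 118]
print([round(r, 1) for r in ratios])  # [22.8, 13.1, 11.4, 10.7]
```

Over this range the state ratio is still above ten but decreasing with each added cell, while the unfolding grows by only a few hundred transitions per cell.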

How do these results compare with other methods for avoiding the state
explosion problem? The trace theory approach of Dill [Dil88] required
an abstract model of the arbiter cell to be created by hand. This
reduces the state explosion problem, but does not entirely solve it,
since even with the reduced model, the number of states still increases
exponentially with the number of components. Probst [PL90] reports a
method which requires quadratic space and time in the number of cells,
but also is not fully automatic. The methods of Valmari [Val89, Val90],
Godefroid [God90, GW91] and Yoneda [YTK91] cannot be effectively
applied to this example or to other speed-independent circuits, because
in all states, all enabled transitions are in conflict with some
disabled transition. Thus no transition can be statically guaranteed to
be persistent. Experiments by Holger Schlingloff¹ have confirmed this
to be the case. It is possible, perhaps, that some more clever static
analysis technique could be used to show that some transitions are
persistent, in which case these methods could be applied to some
effect.

Figure 6.8: Distributed mutual exclusion circuit

Finally, in chapter 2, we saw that the symbolic model checking
method had cubic time complexity and linear space complexity, using
a simultaneous model. Burch and Long² have obtained 0(n- M time
complexity for this circuit using symbolic model checking with a
modified search order (cf. section 2.8). This method requires some
hand optimization, however. In any event, it appears that the symbolic
model checking method yields somewhat better asymptotic performance for
the DME circuit, though both methods effectively solve the state
explosion problem.

¹ Personal communication

² Personal communication

number      size of unfolding    number of
of cells    (transitions)        reachable states

1           23                   22
2           125                  502
3           313                  6579
4           604                  75172
5           1013                 802425

Table 6.2: Performance of ONU and DFT on hazard-detection problem
for the distributed mutual exclusion circuit

6.3  Deadlock  and  occurrence  nets 

Besides coverability, another interesting problem for Petri nets is
the question of deadlock. A terminal marking of a Petri net is one in
which no transitions are enabled. Reachability of a terminal (or
deadlocked) state cannot be framed in terms of the coverability
problem. However, since the unfolding represents all reachable
markings, a net has a reachable terminal marking if and only if its
unfolding has a reachable terminal marking. The problem of existence of
a reachable terminal marking of an occurrence net is NP-complete. This
is easily shown by reduction from 3-SAT.³

³ Satisfiability of a Boolean formula in conjunctive normal form, with
three literals in each conjunct.

Figure 6.9: Reduction from 3-SAT problem to a terminal marking
problem.

To see this, consider the formula
$(x_1 + y_1 + z_1)(x_2 + y_2 + z_2) \cdots (x_n + y_n + z_n)$, where
each $x_i$, $y_i$ and $z_i$ is a positive or negative literal. Assume
the formula has m variables. Let the positive literals be
$l_1, \ldots, l_m$ and the negative literals be
$\bar{l}_1, \ldots, \bar{l}_m$. In polynomial time, we can construct a
net which has a terminal marking if and only if the formula is
satisfiable. The initial marking of the net is a set of places
$\{v_1, \ldots, v_m\}$. There is a place representing each positive
literal $l_1, \ldots, l_m$ and each negative literal
$\bar{l}_1, \ldots, \bar{l}_m$. For each variable $v_i$, there is a
transition from $v_i$ to $l_i$ and from $v_i$ to $\bar{l}_i$. For each
conjunct $(x_i + y_i + z_i)$, there is a transition $c_i$, whose preset
is $\{\bar{x}_i, \bar{y}_i, \bar{z}_i\}$. In other words, the
transition $c_i$ is enabled to fire if and only if
$(x_i + y_i + z_i)$ is false. Thus, some transition $c_i$ is enabled to
fire if and only if the whole formula is false. The postset of each
transition $c_i$ is the single place $q$, and there is a transition
from $\{q\}$ to $\{q\}$. Thus, if any $c_i$ fires, the net may never
reach a terminal marking. As a result, there is a terminal marking of
the net if and only if the formula is satisfiable. For example, figure
6.9 shows the net constructed for the formula (a + b + c)(b + c + d).
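
The reduction can be tried out on small formulas. The Python sketch below builds the net described above and searches its reachable markings exhaustively for a terminal one. The clause encoding (signed integers), the place and transition names, and the use of plain sets as markings (so that several tokens on the looping place count as one) are assumptions of this illustration; the brute-force satisfiable function is included only as a cross-check.

```python
from itertools import product

def sat_net(clauses):
    """clauses: 3-tuples of nonzero ints; k is variable k, -k its negation."""
    variables = sorted({abs(l) for c in clauses for l in c})
    lit = lambda l: f"l{l}" if l > 0 else f"nl{-l}"
    transitions = {}
    for v in variables:
        transitions[f"true{v}"] = ({f"v{v}"}, {lit(v)})
        transitions[f"false{v}"] = ({f"v{v}"}, {lit(-v)})
    for i, clause in enumerate(clauses):
        # c_i needs the negation of every literal in the clause.
        transitions[f"c{i}"] = ({lit(-l) for l in clause}, {"q"})
    transitions["loop"] = ({"q"}, {"q"})
    return transitions, frozenset(f"v{v}" for v in variables)

def has_terminal_marking(transitions, initial):
    seen, stack = {initial}, [initial]
    while stack:
        m = stack.pop()
        enabled = [t for t, (pre, _) in transitions.items() if pre <= m]
        if not enabled:
            return True
        for t in enabled:
            pre, post = transitions[t]
            nxt = frozenset((m - pre) | post)
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

def satisfiable(clauses):
    variables = sorted({abs(l) for c in clauses for l in c})
    for values in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, values))
        if all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False
```

On a satisfiable formula the search finds a marking in which every variable has been assigned and no clause transition is enabled; on an unsatisfiable one, every complete assignment enables some clause transition, after which the looping transition stays enabled forever.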

The reader may easily verify that the size of the unfolding of such
a net (up to the cutoff points) is linear in the size of the original
net. In fact, it is essentially the same net, except that the place
$q$ occurs n times in the unfolding. Since all reachable markings of
the original net occur as configurations of the unfolding, the
unfolding has a terminal marking if and only if the formula is
satisfiable. Hence 3-SAT is P-time reducible to reachability of a
terminal marking of an unfolding. Since the configuration representing
the terminal marking can be guessed in P-time in the size of the
unfolding, and also tested in P-time, it follows that the problem is in
NP, and hence NP-complete.

Interestingly, however, the problem is readily solved in practice even
for very large unfoldings, using an algorithm based on techniques of
constraint satisfaction search. The key observation which leads to this
algorithm is that there is no terminal marking exactly when all
configurations of the unfolding can reach some configuration containing
a cutoff point. This is simply because if there is no terminal marking,
then all configurations can reach a configuration which is arbitrarily
large. A configuration C can reach a configuration containing
transition t' if and only if the union of C and the local configuration
of t' is a configuration. If it is not, then no set containing C and t'
is a configuration. If the union is not a configuration, we will say
that C and t' are in conflict. Hence, there is a terminal marking if
and only if there is a configuration which is in conflict with every
cutoff point. The search for such a configuration can be carried out
using branch and bound techniques. For example, if a configuration C is
in conflict with a cutoff point t', there must be a transition t1 ∈ C
which is in conflict with some transition in the local configuration of
t'. Such a transition t1 will be called a spoiler of t'.

1  let B be the set of the cutoff points, Ts = ∅
2  while B is not empty do
3      let t be the element of B with the fewest spoilers
4      if t has no spoilers, then backtrack
5      choose an element t' from the spoilers of t
6      add t' to Ts
7      delete all transitions in conflict with t'
8  end do

Figure 6.10: Procedure to detect terminal marking.

There exists a configuration in conflict with all of the cutoff points
(equivalently, there exists a terminal marking) if and only if there
exists a configuration containing a spoiler for every cutoff point.
The set of spoilers contained in this configuration will be called Ts.
The algorithm of figure 6.10 uses branch and bound techniques to find
such a set Ts if one exists.

Note that in line 3 of the procedure, the cutoff point with the
smallest number of spoilers is chosen so that the number of choices in
line 5 is minimized. Whenever a spoiler for a given cutoff point is
chosen to belong to Ts in line 5, everything in conflict with Ts is
eliminated from future consideration in line 7. Note that the cutoff
points in conflict with Ts are also eliminated, which cuts down on the
amount of future branching. Whenever there is a cutoff point with no
remaining spoilers, the procedure backtracks from line 4 to the most
recent occurrence of line 5 where there are remaining choices. If there
are no remaining choices, the procedure fails. Of course, when
backtracking occurs, the net is also returned to the state it was in at
the point where execution is being resumed. This backtracking is easily
implemented by keeping a stack of the remaining choices for t' in each
iteration of the loop, and marking each transition in the net with the
level of the stack at the time it was "removed". Interestingly, if the
procedure terminates successfully, the remaining net has the property
that every path leads to a terminal marking of the original net N.
This makes it straightforward to extract a path leading to a terminal
marking.

Obviously, because of the backtracking, this procedure is exponential
(as it must be, if P ≠ NP). However, this is only the worst case. The
dining philosophers serve as an example of a case in which the
exponential complexity is avoided. In fact, the procedure finds the
terminal marking in time which is linear in the number of
philosophers. This is easily seen by examining the unfolding of the
dining philosophers net in figure 6.4. There is one cutoff point in
this net for each process. Initially, each of these transitions has two
spoilers, which correspond to the two resources required to enter the
critical region being granted to the two neighboring processes.
Regardless of which cutoff point is used first, the symmetry is then
broken as the part of the net in conflict with one of the two spoilers
is removed. This removes, in particular, the transition which granted
one of the resources to the first philosopher, hence one of its
neighbors now has only one spoiler, so there is only one choice
available the next time line 5 is reached. After this spoiler is added
to Ts, the remaining neighbor of the second philosopher now has only
one spoiler. This process continues without backtracking until it has
come full circle and the terminal marking is found. Note that if the
cutoff point with the fewest spoilers were not chosen in line 3, the
procedure might have examined an exponential number of candidates for
Ts before a valid one was found.

In fact, using nets representing communication protocols as examples,
this procedure has been successfully applied to unfoldings
with more than 3000 transitions and 1000 cutoff points, where some
cutoff points had as many as 50 spoilers. It is clear that the branch
and bound technique quickly narrows down the number of choices in
these examples.
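
The selection strategy described above (branch on the cutoff point with the fewest remaining spoilers, and backtrack when a cutoff point has none left) can be sketched in a few lines of Python. This is only a simplified illustration, not the procedure as implemented: the names `find_spoiler_set`, `spoilers`, `conflict` and the helper `philosophers` are hypothetical, the unfolding is reduced to a table of spoilers per cutoff point plus a symmetric conflict relation, and the requirement that the chosen transitions form a configuration is not modelled.

```python
def find_spoiler_set(spoilers, conflict):
    """Pick spoilers so that every cutoff point is spoiled and no two picks conflict.

    spoilers: dict mapping each cutoff point to the set of its spoilers.
    conflict: dict mapping a transition to the set of transitions in
              conflict with it (assumed symmetric).
    Returns a list of chosen transitions, or None if the search fails.
    """
    def search(live, chosen):
        if not live:                      # every cutoff point is spoiled
            return chosen
        # Branch on the cutoff point with the fewest remaining spoilers.
        cutoff = min(sorted(live), key=lambda c: len(live[c]))
        for t in sorted(live[cutoff]):    # empty set: fall through and backtrack
            banned = conflict.get(t, set())
            # Cutoff points spoiled by t are dropped; transitions in
            # conflict with t are removed from the remaining choices.
            rest = {c: s - banned for c, s in live.items() if t not in s}
            result = search(rest, chosen + [t])
            if result is not None:
                return result
        return None

    return search({c: set(s) for c, s in spoilers.items()}, [])


def philosophers(n):
    """Hypothetical abstraction of n dining philosophers.

    Transition (f, p) grants fork f to philosopher p. Philosopher i needs
    forks i and i+1 (mod n); its cutoff point is spoiled when either fork
    is granted to the corresponding neighbour instead.
    """
    spoilers, conflict = {}, {}
    for i in range(n):
        left, right = (i - 1) % n, (i + 1) % n
        spoilers[i] = {(i, left), (right, right)}
        # Fork i can go to philosopher i or to philosopher i-1, not both.
        conflict[(i, i)] = {(i, left)}
        conflict[(i, left)] = {(i, i)}
    return spoilers, conflict
```

Calling `find_spoiler_set(*philosophers(5))` returns one fork grant per philosopher. After the first choice every later cutoff point has a single remaining spoiler, so in this toy model no backtracking takes place, which mirrors the argument given for the Dining Philosophers net.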


6.4  Relation  to  AI  techniques 

The occurrence net unfolding method, as applied to the coverability
problem, was inspired by so-called “least commitment” strategies for
AI planning problems, especially non-linear planning techniques. Like
these strategies, the method falls into the category of searching in the
space of solutions (i.e., covering sequences for a given set), rather than
the space of the problem (i.e., the reachable markings). Each partially
constructed unfolding represents some set of possible partial solutions
which may be extended to a complete solution. As in other constraint
satisfaction search methods, the method tries to eliminate as early as
possible those partial solutions which cannot be extended to complete
solutions. This is done in the unfolding procedure by the elimination
of candidate transitions which have no local configuration, and also
by the cutoff points, which effectively discard those partial solutions
which cannot be extended to a lowest cost (i.e., fewest transition)
solution. This is done without unnecessarily committing to the order of
independent transitions. This makes the unfolding method somewhat
similar to constraint posting methods used in non-linear planning. Both
methods construct fairly similar structures, although non-linear planners,
such as NOAH [Sac77], only represent one partial solution, while
the occurrence net represents all partial solutions. Non-linear planners
attempt to detect conflicts and eliminate them by posting additional
constraints on the solution or by modifying elements of the solution.
Non-linear planners also use heuristics to guide them towards a solution,
and hence sometimes overconstrain the solution space and require
backtracking. For this reason, they are heuristically efficient, but would
not be suitable for exhausting the solution space, as is required in
automatic verification methods. In fact, NOAH is not even guaranteed to
find a solution where one exists. A later system called TWEAK [Cha87]
does guarantee a solution where one exists, but fails to terminate if no
solution exists. Since TWEAK does overconstrain the solution space,
exhaustive search could only be achieved by backtracking, even if a
suitable termination condition were found. The unfolding procedure never
overconstrains the solution space, however, as conflicting transitions
may coexist in the unfolding. Hence, it does not backtrack.


6.5  Evaluation 

When is unfolding a suitable strategy for problems in automatic
verification? The most promising application is hazard checking for
asynchronous control circuits. In these circuits, the state explosion seems to
derive almost entirely from arbitrary interleavings of concurrent
transitions. In such cases, the unfolding method can have a considerable
advantage over methods that search the entire state space. Note,
however, that other methods based on partial orders are not necessarily
effective in reducing the state explosion for these circuits, because of
the aforementioned problem of determining when transitions of the net
are persistent.

In general, any problem which can be posed in terms of coverability
or deadlock in a Petri net model is a possible application of the
unfolding method. In addition, it is possible that heuristically efficient
procedures can be found for deciding the existence of an infinite firing
path in some ω-regular set, given an unfolding. In this case,
specifications framed as linear time temporal logic formulas, or ω-automata,
could be evaluated.


Chapter  7 
Conclusion 


What we have seen in the preceding chapters is that Ordered Binary
Decision Diagrams can be used as a representation in a wide variety
of automatic verification algorithms, in order to cope with the state
explosion problem. This can be done in a unified way by representing
the algorithms in the Mu-Calculus fixed point notation. For fairly
diverse families of regularly structured systems, the CTL model checking
algorithm was observed to run in time and space which increased
polynomially in the size of the system, while the number of reachable
states increased exponentially. These results bear out a theoretical
result bounding the OBDD representation of the transition relation for
such systems. Standard automatic verification algorithms would be
unsuitable for these examples because their complexity is proportional to
the number of reachable states.

Using OBDD based techniques, and a language suitable for the
abstract modeling of digital systems, it was possible to verify a fairly
complex industrial design for a cache consistency protocol, finding a
number of subtle errors in the process. The verification process is
valuable not only because of the advantages of formalization and exhaustive
checking, but because it can find protocol errors more quickly than
simulation, despite the exponential growth in states as the model increases
in size. The ability to isolate high level errors quickly shortens the loop
between design and verification, making it possible to experiment more
freely with alternative designs, and shortening the "critical path" from
conceptualization to implementation.




By a technique of induction over processes, it is possible to prove
properties of a protocol which are independent of the number of
processes participating in the protocol. This type of proof requires a
sufficient understanding of the protocol on the part of the designer to
construct a process invariant. Invariants are difficult to find, but the
symbolic model checker provides an aid in this process by producing
counterexamples for unsound invariants. In the author's opinion, finding
a process invariant for a protocol is not only of value as a proof
technique -- the understanding of the protocol required to formulate
the invariant can lead to simpler and more elegant protocols. This is
another reason for formalizing and verifying a protocol before attempting
to implement it.

The verification technique based on occurrence nets shows that OBDDs
are not the only representation that can be used to avoid the state
explosion problem. There are, in fact, certain advantages to the
occurrence net based method for the example presented, since the memory
usage is small, and no heuristic technique is required to produce a
variable ordering. Still, at this stage, the occurrence net method is certainly
not as well advanced as the symbolic model checking method.

There are several areas where the current work falls short of the
goal of complete automatic verification of digital systems. In the case
of the Gigamax protocol, an abstract model of the protocol was verified
and not the actual implementation. Verification of the implementation
would have been impossible due to a lack of formal models of
the components of the system (i.e., standard devices, such as memories,
registers, programmable logic, central processing units, etc.). If such
models were available from the manufacturers, in principle the methods
described in chapter 5 could be used to show that the implementation
is simulated by the abstract model. Hierarchical reasoning of this kind
has been extensively studied by Kurshan [Kur87]. Unfortunately,
simulation does not preserve existential CTL properties such as absence
of deadlock. As mentioned previously, bisimulation equivalence, which
preserves all CTL properties, is too strong for this purpose, since the
abstract models are necessarily non-deterministic, and the actual
implementation cannot (and should not) exhibit this non-determinism.
A practical technique of abstraction which preserves existential CTL
properties is needed if existential properties are to be proved using
hierarchical reasoning.

There  is  also  a  need  for  heuristic  strategies  for  generating  process 
invariants  in  inductive  proofs.  Marelly  and  Grumberg  view  the  design 
of  the  invariant  as  part  of  the  design  of  the  protocol.  This  is  a  use¬ 
ful  point  of  view,  but  some  automated  help  beyond  the  generation  of 
counterexamples  would  be  useful  for  this  purpose. 

Finally,  this  work  concentrates  on  how  to  solve  the  verification  prob¬ 
lem,  once  it  has  been  formalized  as  the  satisfaction  of  a  temporal  logic 
formula  by  a  finite  model,  or  as  an  appropriate  relation  between  finite 
automata.  There  is,  of  course,  a  wide  range  of  issues  involved  in  for¬ 
malizing  the  problem  in  the  first  place.  For  example,  there  is  the  ever 
present  danger  that  the  specification  itself  is  incorrect.  In  the  case  of 
the  very  simple  CTL  formulas  used  to  specify  the  Gigamax  protocol, 
this is perhaps not a severe problem. The abstraction that was used
to create a model for checking the sequential consistency property was,
however, not obviously correct.

In  general,  there  is  a  clear  need  for  complete  mechanical  checking 
that  the  implementation  of  a  processor  or  protocol  matches  the  in¬ 
tended  architecture  (user  model).  This  requires  first  of  all  a  definitive 
model  of  the  architecture  -  something  that  is  currently  lacking  even  for 
standardized architectures in the public domain. Second, there must be
a well defined criterion for determining what is a valid implementation
of the architecture. Loosely, an implementation of a processor is
equivalent to an architecture model if for all programs, the two machines
produce the same "answer". However, for many reasons, this
equivalence cannot be directly stated in terms of equivalence of finite state
machines.  For  one,  most  modern  CPU  architectures  have  no  explicitly 
defined  notion  of  input  and  output.  It  is  not  adequate  to  view  input 
and  output  as  the  sequence  of  loads  or  stores  observed  at  the  memory 
interface,  since  this  sequence  will  differ  among  implementations  (espe¬ 
cially  if  the  implementations  contain  cache  memories,  which  is  often  the 
case).  Solutions  to  the  formalization  problem  are  needed,  but  cannot 
be  obtained  by  studying  theoretical  models  alone.  It  is  necessary  to 
carefully  consider  what  verification  means  in  an  engineering  sense,  as 
well  as  a  mathematical  sense. 

Despite  the  shortcomings  of  current  verification  technology,  it  is 
clear that there are at least small areas of the problem space for which
reasonable solutions exist, and these solutions can be put into practice
to  positive  effect  in  an  industrial  setting.  Those  involved  in  verification 
research  should  perhaps  take  a  closer  look  at  engineering  practice  to 
determine  how  well  the  verification  solutions  match  up  with  real  engi¬ 
neering  problems.  This  effort  may  lead  not  only  to  a  more  practical 
theory  of  formal  verification,  but  also  to  a  rich  source  of  theoretical 
problems. 


Appendix  A 

Semantics of SMV.1


This appendix defines the semantics of programs in the language SMV.1,
which includes the subset SMV.0 plus the process keyword. In SMV.1,
we need to account for both the arbitrary interleaving of processes, and
the rules regarding when a variable may change value as a result of
executing a given process.


A.1 The model

The set $N$ of names is the set of all character strings made up of
the letters, the digits, the underscore and the minus sign characters,
beginning with a letter. The store $L = L_v \cup L_h$ is made up of two
disjoint, countably infinite sets of locations $L_v$ and $L_h$. We will call
the former the visible locations, and the latter the hidden locations.
The set of locations $L$ is defined recursively. It is the least set such
that

1. if $n \in N$, then $n \in L_v$, and

2. if $l \in L_v$ and $n \in N$, then $l.n \in L_v$, and

3. if $l \in L_v$, then $.l \in L_h$.

The set of values $V$ is the union of the integers in the range $[-2^{31}, 2^{31} - 1]$
and $N$, the set of names. A state $x : L \to V$ is a function from locations
to values. Let $S = L \to V$ be the set of all possible states.

If $p$ is a declaration, then its denotation $[\![p]\!]$ is a quadruple $(T, I, R, C)$.
The $T$ component is a partial function from $L$ to the finite subsets of
$V$. If $l$ is a location, then $T(l)$, when defined, is the type of $l$ - the set
of values that can be assigned to location $l$. The component $I \subseteq S$ is
the set of initial states. The component $R \subseteq S \times S$ is the transition
relation. An asynchronous process is identified with a location $r$, which
has the value 1 in a given state exactly when the process is executing
in that state. The component $C \subseteq L \times L$ is the set of pairs $(r, l)$ such
that process $r$ assigns the next value of location $l$.


A.2 Expressions

An expression denotes a function from states to finite subsets of
$V \cup \{\bot\}$, according to the following rules:

1. If $v$ is a value, then $[\![v]\!](x) = \{v\}$.

2. If $l$ is a location, then $[\![l]\!](x) = \{x(l)\}$.

3. If $e_1, e_2$ are expressions, and $\circ$ is one of

+, -, *, /, mod, >, >=, <, <=, =, &, |, ->, <->

then

$[\![e_1 \circ e_2]\!](x) = \{ [\![\circ]\!](v_1, v_2) \mid v_1 \in [\![e_1]\!](x),\ v_2 \in [\![e_2]\!](x) \}$

4. If $e$ is an expression, then

$[\![!e]\!](x) = \{ [\![!]\!](v) \mid v \in [\![e]\!](x) \}$

5. If $e_1, e_2$ are expressions,

$[\![e_1 \;\mathtt{union}\; e_2]\!](x) = [\![e_1]\!](x) \cup [\![e_2]\!](x)$

6. If $e_1, e_2$ are expressions,

$[\![e_1 \;\mathtt{in}\; e_2]\!](x) = [\![e_1]\!](x) \subseteq [\![e_2]\!](x)$


The functions denoted by +, -, *, / are the usual functions of arithmetic
modulo $2^{32}$. The function denoted by mod is the positive remainder
of division mod $2^{32}$. The functions denoted by the relational
operators >, >=, < and <= return 0 when the relation is false and 1 when
the relation is true, and are defined for numeric values only. For
non-numeric values, they return $\bot$. The equality operator = is defined for
all values, and returns 0 when they are unequal, and 1 when they are
equal. The functions denoted by the Boolean operators & (for and),
| (for or), ! (for not), -> (for implies) and <-> (for logical equivalence)
are defined only for the values 0 and 1, and return $\bot$ otherwise.
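
The set-valued reading of expressions can be illustrated with a small evaluator. The following Python sketch rests on several assumptions that are not part of the definition: expressions are encoded as nested tuples, only a subset of the operators is covered, `BOTTOM` is a hypothetical stand-in for the undefined value, integers are taken to be 32 bits wide, and `in` is read as a subset test that yields 0 or 1.

```python
BOTTOM = object()          # hypothetical stand-in for the undefined value


def wrap(n):
    """Two's-complement wrap-around; the 32-bit width is an assumption."""
    return (n + 2**31) % 2**32 - 2**31


def is_int(v):
    return isinstance(v, int)


def is_bit(v):
    return is_int(v) and v in (0, 1)


def arith(f):   # arithmetic operators: numeric arguments only
    return lambda a, b: wrap(f(a, b)) if is_int(a) and is_int(b) else BOTTOM


def rel(f):     # relational operators: numeric arguments only
    return lambda a, b: int(f(a, b)) if is_int(a) and is_int(b) else BOTTOM


def logic(f):   # Boolean operators: arguments must be 0 or 1
    return lambda a, b: int(f(a, b)) if is_bit(a) and is_bit(b) else BOTTOM


BINARY = {
    "+": arith(lambda a, b: a + b),
    "-": arith(lambda a, b: a - b),
    "*": arith(lambda a, b: a * b),
    "<": rel(lambda a, b: a < b),
    ">=": rel(lambda a, b: a >= b),
    "=": lambda a, b: int(a == b),          # defined for all values
    "&": logic(lambda a, b: a and b),
    "|": logic(lambda a, b: a or b),
    "->": logic(lambda a, b: (not a) or b),
}


def evaluate(e, x):
    """Return the set of values denoted by expression e in state x (a dict)."""
    kind = e[0]
    if kind == "val":
        return {e[1]}
    if kind == "loc":
        return {x[e[1]]}
    if kind == "not":
        return {1 - v if is_bit(v) else BOTTOM for v in evaluate(e[1], x)}
    if kind == "union":
        return evaluate(e[1], x) | evaluate(e[2], x)
    if kind == "in":                         # assumed: subset test
        return {int(evaluate(e[1], x) <= evaluate(e[2], x))}
    op, e1, e2 = e                           # binary operator, applied pointwise
    return {BINARY[op](a, b) for a in evaluate(e1, x) for b in evaluate(e2, x)}
```

For instance, in the state `{"a": 3}` the expression `("+", ("loc", "a"), ("union", ("val", 1), ("val", 2)))` denotes the set `{4, 5}`: the nondeterministic choice on the right is propagated pointwise through the addition.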


A.3  Assignments  and  definitions 

There is no semantic difference between assignments and definitions.
If $l$ is a location, and $e$ is an expression, then the assignment l := e;
denotes a quadruple $(T, I, R, C)$, where

1. $T = \emptyset$

2. $I = S$

3. $R = \{(x, y) \in S^2 \mid x(l) \in [\![e]\!](x)\}$

4. $C = \emptyset$

The assignment next(l) := e; denotes a quadruple $(T, I, R, C)$ where

1. $T = \emptyset$

2. $I = S$

3. $R = \{(x, y) \in S^2 \mid x(\mathtt{running}) = 1 \Rightarrow y(l) \in [\![e]\!](x)\}$

4. $C = \{(\mathtt{running}, l)\}$

The assignment init(l) := e; denotes a quadruple $(T, I, R, C)$ where

1. $T = \emptyset$

2. $I = \{x \in S \mid x(l) \in [\![e]\!](x)\}$

3. $R = S^2$

4. $C = \emptyset$


A.4 Variable declarations

If $l$ is an identifier and $v_1, v_2, \ldots, v_n$ are values, then

VAR l : {v_1, v_2, ..., v_n};

denotes a quadruple $(T, I, R, C)$ where

1. $T = \{(l, \{v_1, v_2, \ldots, v_n\})\}$

2. $I = \{x \in S \mid x(l) \in \{v_1, v_2, \ldots, v_n\}\}$

3. $R = \{(x, y) \in S^2 \mid x(l), y(l) \in \{v_1, v_2, \ldots, v_n\}\}$

4. $C = \emptyset$

A.5 Renaming

Let $\phi : L \to L$ be a function from locations to locations. This in turn
induces a map $\Phi$ on states, such that for all states $x$ and locations $l$,

$\Phi(x)(l) = x(\phi(l))$.

If $M = (T, I, R, C)$, then let $\phi(M) = (T', I', R', C')$ where

1. $T'(\phi(l)) = T(l)$,

2. $I' = \{x \mid \Phi(x) \in I\}$, and

3. $R' = \{(x, y) \mid (\Phi(x), \Phi(y)) \in R\}$,

4. $C' = \{(\phi(r), \phi(l)) \mid (r, l) \in C\}$.

Note that the definition of $T'$ does not make sense if $\phi$ maps two
locations with different types onto the same location. In this case, $\phi(M)$ is
a type error.


A.6 Parallel composition

Let $M_1 = (T_1, I_1, R_1, C_1)$ and $M_2 = (T_2, I_2, R_2, C_2)$. Let $n_1$ and $n_2$ be
two distinct names. For $i \in \{1, 2\}$, let $\phi_i(.l) = .n_i.l$ for all $.l \in L_h$ and
$\phi_i(l) = l$ otherwise, and let $M_i' = \phi_i(M_i)$. The parallel composition
$M = M_1 \parallel M_2$ is defined as follows:

1. $T = T_1' \cup T_2'$

2. $I = I_1' \cap I_2'$

3. $R = R_1' \cap R_2'$

4. $C = C_1' \cup C_2'$.

If $d_1, d_2, \ldots, d_k$ are declarations, then $[\![d_1 \ldots d_k]\!]$ is the parallel
composition

$[\![d_1]\!] \parallel [\![d_2]\!] \parallel \cdots \parallel [\![d_k]\!]$
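
The denotations of variable declarations, init and next assignments, and their parallel composition can be made concrete on a very small finite store. The Python sketch below is an illustration under stated assumptions, not part of the semantics: the store has just the two locations `running` and `l`, the value universe is cut down to `{0, 1, 2}`, expressions are passed as Python functions from a state to a set of values, hidden locations (and hence the renaming step) are left out, and conflicting types in `compose` are not detected. All function names are hypothetical.

```python
from itertools import product

LOCS = ("running", "l")            # hypothetical two-location store
UNIVERSE = (0, 1, 2)               # hypothetical finite set of values
# A state is a tuple of (location, value) pairs, so that it is hashable.
S = [tuple(zip(LOCS, vs)) for vs in product(UNIVERSE, repeat=len(LOCS))]
PAIRS = [(x, y) for x in S for y in S]


def get(x, loc):
    return dict(x)[loc]


def var_decl(l, values):           # VAR l : {v_1, ..., v_n};
    values = set(values)
    return {"T": {l: values},
            "I": {x for x in S if get(x, l) in values},
            "R": {(x, y) for x, y in PAIRS
                  if get(x, l) in values and get(y, l) in values},
            "C": set()}


def init_assign(l, e):             # init(l) := e;
    return {"T": {}, "I": {x for x in S if get(x, l) in e(x)},
            "R": set(PAIRS), "C": set()}


def next_assign(l, e):             # next(l) := e;
    return {"T": {}, "I": set(S),
            "R": {(x, y) for x, y in PAIRS
                  if get(x, "running") != 1 or get(y, l) in e(x)},
            "C": {("running", l)}}


def compose(m1, m2):               # parallel composition, no hidden locations
    return {"T": {**m1["T"], **m2["T"]},
            "I": m1["I"] & m2["I"],
            "R": m1["R"] & m2["R"],
            "C": m1["C"] | m2["C"]}


# VAR l : {0, 1};  init(l) := 0;  next(l) := 1 - l;
M = compose(compose(var_decl("l", {0, 1}),
                    init_assign("l", lambda x: {0})),
            next_assign("l", lambda x: {1 - get(x, "l")}))
```

In the resulting `M`, the initial states are exactly those with `l = 0`, and every transition leaving a state with `running = 1` toggles `l`. Transitions leaving states where `running` is not 1 are restricted only by the type of `l`; the interleaving rules of A.8 are what later force `l` to keep its value in that case.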


A.7 Instantiation

Suppose that module $A$ is defined as follows:

MODULE A(n_1, n_2, ..., n_k) D

where $n_1, n_2, \ldots, n_k$ are distinct names and $D$ is a sequence of
declarations. We first consider the variable declaration

VAR r : A(l_1, l_2, ..., l_k);

where $r, l_1, \ldots, l_k$ are visible locations. Let $\phi$ be a renaming, such
that, for all $l \in L_v$,

1. for all $1 \le i \le k$: $\phi(n_i) = l_i$ and $\phi(n_i.l) = l_i.l$

2. $\phi(.l) = .l$

3. for all $n \in N - \{\mathtt{running}, n_1, n_2, \ldots, n_k\}$: $\phi(n) = r.n$, and $\phi(n.l) = r.n.l$.

4. $\phi(\mathtt{running}) = \mathtt{running}$, $\phi(\mathtt{running}.l) = \mathtt{running}.l$

Then $[\![\mathtt{VAR}\ r : A(l_1, l_2, \ldots, l_k);]\!] = \phi([\![D]\!])$.

Now we consider the declaration

VAR r : process A(l_1, l_2, ..., l_k);

Let $\phi$ be a renaming, such that, for all $l \in L_v$,

1. for all $1 \le i \le k$: $\phi(n_i) = l_i$ and $\phi(n_i.l) = l_i.l$

2. for all $n \in N - \{n_1, n_2, \ldots, n_k\}$: $\phi(n) = r.n$, and $\phi(n.l) = r.n.l$

3. $\phi(.l) = .l$

Then $[\![\mathtt{VAR}\ r : \mathtt{process}\ A(l_1, l_2, \ldots, l_k);]\!] = \phi([\![D]\!])$.

A.8  Programs  and  interleaving 

Suppose that module main is defined as follows:

MODULE main D

where $D$ is a sequence of declarations and $[\![D]\!] = (T, I, R, C)$. The
number of processes executing in state $x$ is

$n_e(x) = |\{r \mid (x(r) = 1) \wedge \exists l : (r, l) \in C\}|$.

The set of legal interleaving states is

$S_I = \{x \in S \mid n_e(x) = 1\}$.

The set of states in which location $l$ is constrained to remain unchanged
is

$U(l) = \{x \in S \mid [\exists r : (r, l) \in C] \wedge \neg\exists r : [(r, l) \in C \wedge (x(r) = 1)]\}$

The denotation of the program is a triple $(T, I', R')$, where

1. $I' = I \cap S_I$,

2. $R' = \{(x, y) \in R \mid x \in S_I \wedge \forall l \in L : (x \in U(l) \Rightarrow x(l) = y(l))\}$.
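
The interleaving rules can be checked by brute force on a toy store. In the following Python sketch everything concrete is an assumption made for illustration: there are two hypothetical processes with running flags `p.running` and `q.running`, process `p` assigns location `a` and process `q` assigns location `b`, all locations are one bit wide, and the helper names `n_e`, `unchanged` and `interleave` are invented here.

```python
from itertools import product

LOCS = ("p.running", "q.running", "a", "b")     # hypothetical locations
IDX = {loc: i for i, loc in enumerate(LOCS)}
S = list(product((0, 1), repeat=len(LOCS)))     # states as value tuples
C = {("p.running", "a"), ("q.running", "b")}    # p assigns a, q assigns b


def n_e(x):
    """Number of processes executing in state x."""
    processes = {r for (r, _) in C}
    return sum(1 for r in processes if x[IDX[r]] == 1)


def unchanged(x, l):
    """True when x is in U(l): l has an assigning process, none is running."""
    owners = [r for (r, m) in C if m == l]
    return bool(owners) and not any(x[IDX[r]] == 1 for r in owners)


def interleave(I, R):
    """Restrict (I, R) to legal interleavings, giving (I', R')."""
    S_I = {x for x in S if n_e(x) == 1}
    I2 = I & S_I
    R2 = {(x, y) for (x, y) in R
          if x in S_I and all(x[IDX[l]] == y[IDX[l]]
                              for l in LOCS if unchanged(x, l))}
    return I2, R2


# Start from a completely unconstrained program over this store.
I2, R2 = interleave(set(S), {(x, y) for x in S for y in S})
```

After the restriction, exactly one of the two running flags is set in every source state, and the location owned by the idle process keeps its value across each transition.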


A.9 Specifications

Each program is associated with a Kripke structure which determines
the truth value of CTL formulas in the specification. The atomic
propositions in this case are all the Boolean valued expressions. The Kripke
structure associated with a program whose denotation is the triple
$(T, I, R)$ is a Kripke model $K = (S, R, L')$ where

1. $S$ is the set of states defined above,

2. $R$ is the transition relation, and

3. if $e$ is an expression, then

$L'(e) = \{x \in S \mid [\![e]\!](x) = \{1\}\}$

The specification is a formula $f$ in CTL with fairness constraints. It is
satisfied exactly when $K, s_0 \models f$ for all $s_0 \in I$.
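
As a small illustration of what it means for a specification to hold in all initial states, the Python sketch below checks a formula of the form AG e on an explicitly enumerated Kripke structure, by computing EF of the complement of L'(e) as a least fixed point. It is an explicit-state sketch with hypothetical function names; fairness constraints are ignored, and no OBDDs are used.

```python
def ex(R, Z):
    """States that have at least one successor in Z."""
    return {x for (x, y) in R if y in Z}


def ef(R, P):
    """Least fixed point of Z = P ∪ EX Z: states from which P is reachable."""
    Z = set(P)
    while True:
        bigger = Z | ex(R, Z)
        if bigger == Z:
            return Z
        Z = bigger


def satisfies_ag(S, I, R, label):
    """True when AG e holds in every initial state, with label = L'(e)."""
    bad = ef(R, set(S) - set(label))      # states that can reach a violation
    return set(I).isdisjoint(bad)
```

For example, with `S = {0, 1, 2, 3}`, `R = {(0, 1), (1, 0), (2, 3), (3, 3)}` and `label = {0, 1, 2}`, the property holds when `I = {0}` and fails when `I = {0, 2}`, because state 2 can reach the unlabelled state 3.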